$ git clone http://git-wip-us.apache.org/repos/asf/incubator-samza.git
$ cd incubator-samza
$ ./gradlew -PscalaVersion=2.8.1 clean publishToMavenLocal
When the build completes, update the project containing the Job
implementation to include the samza-api dependency in its pom.xml
file:
<dependency>
  <groupId>org.apache.samza</groupId>
  <artifactId>samza-api</artifactId>
  <version>0.7.0</version>
</dependency>
Configuring a Job
A Samza Job is configured through a Properties file that is passed to
the Samza framework when the Job is submitted to YARN. The Properties
file starts with a Job factory class specification and a Job name,
along with a distribution package:
job.factory.class=org.apache.samza.job.yarn.YarnJobFactory
job.name=wordcount-split
yarn.package.path=file://${basedir}/target/${project.artifactId}-${pom.version}-dist.tar.gz
The factory class will generally be YarnJobFactory, and the name is set
here to wordcount-split, the Job implemented in the next section. The
yarn.package.path is filled in by the build process and specifies the
location of an archive that YARN transfers to each of the nodes. This
archive contains the code that implements the Job, along with any
support JAR files it might need.
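Because a mistyped or missing key only surfaces once the Job is submitted, it can help to check the file locally first. The following is a minimal sketch, not part of Samza itself: it uses only the standard java.util.Properties class, and the class name JobConfigCheck, its helper methods, and the /tmp archive path are hypothetical names chosen for illustration. Treating the three properties described above as required is likewise an assumption of this sketch.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

public class JobConfigCheck {
    // Keys named in the text above; treating them as required is an
    // assumption of this sketch, not a rule taken from Samza.
    static final String[] REQUIRED_KEYS = {
        "job.factory.class", "job.name", "yarn.package.path"
    };

    /** Parses properties-file text into a Properties object. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns the required keys that are absent or blank. */
    static List<String> missingKeys(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // The archive path below is a placeholder; in a real build it is
        // filled in by the build process as described above.
        String text = String.join("\n",
            "job.factory.class=org.apache.samza.job.yarn.YarnJobFactory",
            "job.name=wordcount-split",
            "yarn.package.path=file:///tmp/wordcount-dist.tar.gz");
        Properties props = parse(text);
        System.out.println("job.name = " + props.getProperty("job.name"));
        System.out.println("missing keys = " + missingKeys(props));
    }
}
```

In practice the text would be read from the Job's actual Properties file rather than from an inline string; the inline form keeps the sketch self-contained.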
Next, the task is defined. This consists, minimally, of the task.class and
task.inputs properties: