</dependencies>
<build>
  <finalName>hadoop-examples</finalName>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-compiler-plugin</artifactId>
      <version>3.1</version>
      <configuration>
        <source>1.6</source>
        <target>1.6</target>
      </configuration>
    </plugin>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-jar-plugin</artifactId>
      <version>2.5</version>
      <configuration>
        <outputDirectory>${basedir}</outputDirectory>
      </configuration>
    </plugin>
  </plugins>
</build>
</project>
The dependencies section is the interesting part of the POM. (It is straightforward to use another build tool, such as Gradle or Ant with Ivy, as long as you use the same set of dependencies defined here.) For building MapReduce jobs, you only need the hadoop-client dependency, which contains all the Hadoop client-side classes needed to interact with HDFS and MapReduce. For running unit tests, we use junit, and for writing MapReduce tests, we use mrunit. The hadoop-minicluster library contains the "mini-" clusters that are useful for testing with Hadoop clusters running in a single JVM.
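For reference, a dependencies section along the lines just described might look something like the following sketch. The version numbers shown here are illustrative, not prescriptive; you should pick versions that match the Hadoop release you are targeting:

```xml
<dependencies>
  <!-- Client-side classes for HDFS and MapReduce -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>2.5.1</version> <!-- illustrative version -->
  </dependency>
  <!-- Unit testing -->
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.11</version> <!-- illustrative version -->
    <scope>test</scope>
  </dependency>
  <!-- MapReduce unit testing -->
  <dependency>
    <groupId>org.apache.mrunit</groupId>
    <artifactId>mrunit</artifactId>
    <version>1.1.0</version> <!-- illustrative version -->
    <classifier>hadoop2</classifier>
    <scope>test</scope>
  </dependency>
  <!-- Mini-clusters for testing against a single-JVM Hadoop cluster -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-minicluster</artifactId>
    <version>2.5.1</version> <!-- illustrative version -->
    <scope>test</scope>
  </dependency>
</dependencies>
```

Note that the test-only dependencies are given test scope, so they are available on the classpath when compiling and running tests but are not bundled with the job JAR.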
Many IDEs can read Maven POMs directly, so you can just point them at the directory containing the pom.xml file and start writing code. Alternatively, you can use Maven to generate configuration files for your IDE. For example, the following creates Eclipse configuration files so you can import the project into Eclipse:
% mvn eclipse:eclipse -DdownloadSources=true -DdownloadJavadocs=true
Managing Configuration
When developing Hadoop applications, it is common to switch between running the application locally and running it on a cluster. In fact, you may have several clusters you