The main method at line 49 sets up the MapReduce configuration by defining the type of input; in this case, the input is text.
49 public static void main(String[] args) throws Exception
The code then defines the Map, Combine, and Reduce classes, as well as specifying the input and output formats:
51 JobConf conf = new JobConf(WordCount.class);
52 conf.setJobName("wordcount");
53
54 conf.setOutputKeyClass(Text.class);
55 conf.setOutputValueClass(IntWritable.class);
56
57 conf.setMapperClass(Map.class);
58 conf.setCombinerClass(Reduce.class);
59 conf.setReducerClass(Reduce.class);
60
61 conf.setInputFormat(TextInputFormat.class);
62 conf.setOutputFormat(TextOutputFormat.class);
63
64 FileInputFormat.setInputPaths(conf, new Path(args[0]));
65 FileOutputFormat.setOutputPath(conf, new Path(args[1]));
Finally, line 67 runs the job:
67 JobClient.runJob(conf);
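The JobClient.runJob() call submits the job to the cluster and blocks until it completes, reporting progress as it runs. For reference, the Map and Reduce inner classes registered at lines 57 through 59 follow the classic WordCount pattern for the old org.apache.hadoop.mapred API. The sketch below assumes the standard example code rather than reproducing the file verbatim:

import java.io.IOException;
import java.util.Iterator;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// Inner classes of WordCount.
public static class Map extends MapReduceBase
        implements Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    // Split each input line into words and emit (word, 1) for each one.
    public void map(LongWritable key, Text value,
            OutputCollector<Text, IntWritable> output,
            Reporter reporter) throws IOException {
        StringTokenizer tokenizer = new StringTokenizer(value.toString());
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            output.collect(word, one);
        }
    }
}

public static class Reduce extends MapReduceBase
        implements Reducer<Text, IntWritable, Text, IntWritable> {
    // Sum the counts emitted for each word.
    public void reduce(Text key, Iterator<IntWritable> values,
            OutputCollector<Text, IntWritable> output,
            Reporter reporter) throws IOException {
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next().get();
        }
        output.collect(key, new IntWritable(sum));
    }
}

Because the reducer simply sums integer counts, the same class can safely double as the combiner, which is why line 58 passes Reduce.class to setCombinerClass.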
Running the Example 1 Code
To compile this code, I used the Java compiler javac, which was installed as part of the Java 1.6 JDK. The compiler expects the Java file name to match the public class name, so I renamed the example code to WordCount.java.
The classes on which this example relies are found in the Hadoop core library shipped with the Hadoop release, so I specified that library on the classpath when compiling the code. I also placed the compiled output into a subdirectory called wc_classes, which can be used when building an example jar file (an example jar command follows the listing below).
[hadoop@hc1nn wordcount]$ cp wc-ex1.java WordCount.java
[hadoop@hc1nn wordcount]$ mkdir wc_classes
[hadoop@hc1nn wordcount]$ javac -classpath $HADOOP_PREFIX/hadoop-core-1.2.1.jar -d wc_classes WordCount.java
The following recursive listing shows all of the subdirectories and classes from the build of the first example code:
[hadoop@hc1nn wordcount]$ ls -R wc_classes
wc_classes:
org
wc_classes/org:
myorg
wc_classes/org/myorg:
WordCount.class WordCount$Map.class WordCount$Reduce.class
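With the classes built under wc_classes, the example jar file mentioned earlier can be created and the job run. This is a sketch only: the jar name wordcount.jar and the HDFS paths /user/hadoop/input and /user/hadoop/output are illustrative choices, not taken from the example.

[hadoop@hc1nn wordcount]$ jar -cvf wordcount.jar -C wc_classes/ .
[hadoop@hc1nn wordcount]$ hadoop jar wordcount.jar org.myorg.WordCount /user/hadoop/input /user/hadoop/output

The hadoop jar command passes the two path arguments through to main, where they become args[0] and args[1] at lines 64 and 65.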