    job.setInputFormatClass(TextInputFormat.class);
    job.setMapperClass(Mapper.class);
    job.setMapOutputKeyClass(LongWritable.class);
    job.setMapOutputValueClass(Text.class);
    job.setPartitionerClass(HashPartitioner.class);
    job.setNumReduceTasks(1);
    job.setReducerClass(Reducer.class);
    job.setOutputKeyClass(LongWritable.class);
    job.setOutputValueClass(Text.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    int exitCode = ToolRunner.run(new MinimalMapReduceWithDefaults(), args);
    System.exit(exitCode);
  }
}
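The driver sets HashPartitioner together with a single reduce task, so every record ends up in partition 0. As a rough sketch of why, the snippet below mimics the hash-then-modulo scheme that HashPartitioner is generally documented to use. The class name HashPartitionSketch and the method partitionFor are hypothetical, and java.lang.Long stands in for LongWritable so that the example runs without Hadoop on the classpath; the hash codes of the two types may differ for large values.

```java
public class HashPartitionSketch {

    // Hypothetical helper: masks off the sign bit of the key's hash code so the
    // result is non-negative, then takes the remainder modulo the reducer count.
    static int partitionFor(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        // Stand-ins for the byte offsets that TextInputFormat supplies as keys.
        long[] offsets = {0L, 42L, 1024L};
        for (long offset : offsets) {
            Long key = offset;
            System.out.println("key=" + offset
                + " partition(1 reducer)=" + partitionFor(key, 1)
                + " partition(4 reducers)=" + partitionFor(key, 4));
        }
    }
}
```

With one reducer the remainder is always zero, which is why the choice of partitioner only starts to matter once setNumReduceTasks() is given a value greater than 1.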
We've simplified the first few lines of the run() method by extracting the logic for
printing usage and setting the input and output paths into a helper method. Almost all
MapReduce drivers take these two arguments (input and output), so factoring out this
boilerplate is worthwhile. Note that the helper returns null when the argument count is
wrong, so the caller has to check for that before using the job. Here are the relevant
methods in the JobBuilder class for reference:
  public static Job parseInputAndOutput(Tool tool, Configuration conf,
      String[] args) throws IOException {
    if (args.length != 2) {
      printUsage(tool, "<input> <output>");
      return null;
    }
    Job job = new Job(conf);
    job.setJarByClass(tool.getClass());
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    return job;