    Job job = new Job(getConf());
    job.setJarByClass(getClass());
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    int exitCode = ToolRunner.run(new MinimalMapReduce(), args);
    System.exit(exitCode);
  }
}
The only configuration that we set is an input path and an output path. We run it over a
subset of our weather data with the following:
% hadoop MinimalMapReduce "input/ncdc/all/190{1,2}.gz" output
We do get some output: one file named part-r-00000 in the output directory. Here's what the first few lines look like (truncated to fit the page):
0→0029029070999991901010106004+64333+023450FM-12+000599999V0202701N01591...
0→0035029070999991902010106004+64333+023450FM-12+000599999V0201401N01181...
135→0029029070999991901010113004+64333+023450FM-12+000599999V0202901N00821...
141→0035029070999991902010113004+64333+023450FM-12+000599999V0201401N01181...
270→0029029070999991901010120004+64333+023450FM-12+000599999V0209991C00001...
282→0035029070999991902010120004+64333+023450FM-12+000599999V0201401N01391...
Each line is an integer, followed by a tab character, followed by the original weather data record. Admittedly, it's not a very useful program, but understanding how it produces its output does provide some insight into the defaults that Hadoop uses when running MapReduce jobs.
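To see why the integer and the tab appear, consider what the defaults do. With the default input format, TextInputFormat, the key for each record is the byte offset of the line within the file and the value is the line itself; the default mapper and reducer simply pass their input through unchanged. The following sketch mirrors that identity behavior (the class name IdentityLineMapper is ours, purely for illustration; it is not the actual Hadoop source):

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Roughly what the stock Mapper class does for every record when you don't
// set a mapper of your own.
public class IdentityLineMapper
    extends Mapper<LongWritable, Text, LongWritable, Text> {

  @Override
  protected void map(LongWritable offset, Text line, Context context)
      throws IOException, InterruptedException {
    // Write the byte offset and the line back out untouched. The default
    // reducer is also an identity function, so offset-tab-record pairs end
    // up in the single output file, part-r-00000.
    context.write(offset, line);
  }
}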
Example 8-1 shows a program that has exactly the same effect as MinimalMapReduce, but explicitly sets the job settings to their defaults.
Example 8-1. A minimal MapReduce driver, with the defaults explicitly set
public class MinimalMapReduceWithDefaults extends Configured implements Tool {

  @Override
  public int run(String[] args) throws Exception {
    Job job = JobBuilder.parseInputAndOutput(this, getConf(), args);
    if (job == null) {
      return -1;
    }