Running Locally on Test Data
Now that we have the mapper and reducer working on controlled inputs, the next step is to
write a job driver and run it on some test data on a development machine.
Running a Job in a Local Job Runner
Using the Tool interface introduced earlier in the chapter, it's easy to write a driver to run our MapReduce job for finding the maximum temperature by year (see MaxTemperatureDriver in Example 6-10).
Example 6-10. Application to find the maximum temperature
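import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MaxTemperatureDriver extends Configured implements Tool {

  @Override
  public int run(String[] args) throws Exception {
    // Expect exactly two arguments: the input path and the output path.
    if (args.length != 2) {
      System.err.printf("Usage: %s [generic options] <input> <output>\n",
          getClass().getSimpleName());
      ToolRunner.printGenericCommandUsage(System.err);
      return -1;
    }

    // Build the job from the configuration that ToolRunner populated.
    // Note: this constructor is deprecated in later Hadoop releases in
    // favor of Job.getInstance(getConf(), "Max temperature").
    Job job = new Job(getConf(), "Max temperature");
    job.setJarByClass(getClass());

    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    // The reducer doubles as the combiner.
    job.setMapperClass(MaxTemperatureMapper.class);
    job.setCombinerClass(MaxTemperatureReducer.class);
    job.setReducerClass(MaxTemperatureReducer.class);

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    // Exit status 0 on success, 1 on failure.
    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    int exitCode = ToolRunner.run(new MaxTemperatureDriver(), args);
    System.exit(exitCode);
  }
}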
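The driver's return values follow a simple contract: -1 for a usage error, 0 when the job succeeds, and 1 when it fails. The following is a minimal sketch of that control flow in plain Java, with no Hadoop classes involved, so it can be run on its own. The class name DriverExitCodeSketch and the jobRunner parameter are hypothetical; jobRunner merely stands in for the call to job.waitForCompletion(true) and is not part of any Hadoop API.

```java
import java.util.function.Predicate;

/** Hypothetical stand-in for the driver's control flow; not Hadoop code. */
public class DriverExitCodeSketch {

    /**
     * Mirrors the shape of run(String[]) in Example 6-10:
     * -1 for a usage error, 0 if the job succeeds, 1 if it fails.
     * jobRunner is an assumption standing in for job.waitForCompletion(true).
     */
    public static int run(String[] args, Predicate<String[]> jobRunner) {
        if (args.length != 2) {
            System.err.printf("Usage: %s <input> <output>%n",
                DriverExitCodeSketch.class.getSimpleName());
            return -1;
        }
        return jobRunner.test(args) ? 0 : 1;
    }

    public static void main(String[] args) {
        // Wrong number of arguments: usage error.
        System.out.println(run(new String[] {"in"}, a -> true));
        // Two arguments and a "job" that succeeds.
        System.out.println(run(new String[] {"in", "out"}, a -> true));
        // Two arguments and a "job" that fails.
        System.out.println(run(new String[] {"in", "out"}, a -> false));
    }
}
```

Because main() in Example 6-10 passes the value returned by run() to System.exit(), a calling script can tell the three cases apart. On POSIX shells an exit code of -1 is typically reported as status 255.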