  @Override
  public int run(String[] args) throws Exception {
    if (args.length != 2) {
      System.err.printf("Usage: %s [generic options] <input> <output>\n",
          getClass().getSimpleName());
      ToolRunner.printGenericCommandUsage(System.err);
      return -1;
    }

    Job job = new Job(getConf(), "Max temperature");
    job.setJarByClass(getClass());

    job.getConfiguration().setBoolean(
        Job.MAPREDUCE_JOB_USER_CLASSPATH_FIRST, true);

    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));

    AvroJob.setMapOutputKeySchema(job, Schema.create(Schema.Type.INT));
    AvroJob.setMapOutputValueSchema(job, SCHEMA);
    AvroJob.setOutputKeySchema(job, SCHEMA);

    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(AvroKeyOutputFormat.class);

    job.setMapperClass(MaxTemperatureMapper.class);
    job.setReducerClass(MaxTemperatureReducer.class);

    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    int exitCode = ToolRunner.run(new AvroGenericMaxTemperature(), args);
    System.exit(exitCode);
  }
}
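The driver above hands Avro a schema object (SCHEMA) at run time instead of relying on generated record classes, so record fields are reached by name. As a rough illustration of what that costs in type safety, here is a minimal JDK-only sketch. It does not use the Avro API: a Map stands in for a generic record, and the class names GenericVsSpecificSketch and WeatherRecord, the field names, and the values are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class GenericVsSpecificSketch {

    // Hypothetical stand-in for a generated record class: fields are real
    // Java members, so the compiler checks their names and types.
    public static final class WeatherRecord {
        public final int year;
        public final int temperature;

        public WeatherRecord(int year, int temperature) {
            this.year = year;
            this.temperature = temperature;
        }
    }

    // Hypothetical stand-in for a generic record: fields live in a map
    // keyed by name, so names are only checked when the code runs.
    public static Map<String, Object> genericRecord(int year, int temperature) {
        Map<String, Object> record = new HashMap<>();
        record.put("year", year);
        record.put("temperature", temperature);
        return record;
    }

    public static void main(String[] args) {
        Map<String, Object> generic = genericRecord(1950, 22);

        // Correct name: the value is found, but comes back as Object.
        Object found = generic.get("temperature");

        // Misspelled name: this still compiles; the map simply has no such key.
        Object missing = generic.get("temprature");

        System.out.println("generic lookup:    " + found);
        System.out.println("misspelled lookup: " + missing);

        // Class-based record: a misspelled field (specific.temprature)
        // would be rejected by the compiler instead.
        WeatherRecord specific = new WeatherRecord(1950, 22);
        System.out.println("specific field:    " + specific.temperature);
    }
}
```

With the map-based record, the misspelled name compiles and only shows up as a missing value at run time; with the class-based record, the same mistake never gets past the compiler. How Avro itself reacts to an unknown field name is not shown here.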
This program uses the Generic Avro mapping. This frees us from generating code to represent
records, at the expense of type safety (field names are referred to by string value). The
schema is inlined in the code for convenience (and read into the SCHEMA constant), although
in practice it might be more maintainable to read the schema from a local file in the driver
code and pass it to the map-