    }

    job.setMapperClass(MaxTemperatureMapper.class);
    job.setPartitionerClass(FirstPartitioner.class);
    job.setSortComparatorClass(KeyComparator.class);
    job.setGroupingComparatorClass(GroupComparator.class);
    job.setReducerClass(MaxTemperatureReducer.class);
    job.setOutputKeyClass(IntPair.class);
    job.setOutputValueClass(NullWritable.class);

    return job.waitForCompletion(true) ? 0 : 1;
  }

  public static void main(String[] args) throws Exception {
    int exitCode = ToolRunner.run(new MaxTemperatureUsingSecondarySort(), args);
    System.exit(exitCode);
  }
}
In the mapper, we create a key representing the year and temperature, using an IntPair.
(See Implementing a Custom Writable.) We don't need to carry any information in the value,
because we can get the first (maximum) temperature in the reducer from the key, so we use a
NullWritable. The reducer emits the first key, which, due to the secondary sorting, is an
IntPair for the year and its maximum temperature. IntPair's toString() method creates a
tab-separated string, so the output is a set of tab-separated year-temperature pairs.
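The effect of the secondary sort described above can be imitated in plain Java, without any Hadoop classes. The following is only a sketch under stated assumptions: YearTemp is a hypothetical stand-in for IntPair, the sample readings are made up, and an in-memory sort takes the place of the shuffle. It shows that once keys are ordered by year (ascending) and temperature (descending), the first key seen for each year already carries the maximum temperature.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SecondarySortSketch {

    /** Hypothetical stand-in for the IntPair key: (year, temperature). */
    public static final class YearTemp {
        final int year;
        final int temp;

        public YearTemp(int year, int temp) {
            this.year = year;
            this.temp = temp;
        }

        /** Tab-separated, mirroring what the text says about IntPair's toString(). */
        @Override
        public String toString() {
            return year + "\t" + temp;
        }
    }

    /** Sort order: year ascending, then temperature descending. */
    static final Comparator<YearTemp> SORT_ORDER = (a, b) -> {
        int byYear = Integer.compare(a.year, b.year);
        return byYear != 0 ? byYear : Integer.compare(b.temp, a.temp);
    };

    /** Sorts the keys, then emits only the first key of each year group. */
    public static List<String> maxPerYear(List<YearTemp> keys) {
        List<YearTemp> sorted = new ArrayList<>(keys);
        sorted.sort(SORT_ORDER);

        List<String> out = new ArrayList<>();
        Integer currentYear = null;
        for (YearTemp key : sorted) {
            // A change of year marks the start of a new group.
            if (currentYear == null || currentYear != key.year) {
                out.add(key.toString());
                currentYear = key.year;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample readings, not data from the text.
        List<YearTemp> keys = Arrays.asList(
            new YearTemp(1901, 250),
            new YearTemp(1900, 10),
            new YearTemp(1901, 317),
            new YearTemp(1900, 120));
        for (String line : maxPerYear(keys)) {
            System.out.println(line);
        }
    }
}
```

In the real job, the sorting and grouping are performed by the framework using the configured comparators; the loop here only imitates that behavior on a single in-memory list.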
NOTE
Many applications need to access all the sorted values, not just the first value as we have provided here.
To do this, you need to populate the value fields since in the reducer you can retrieve only the first key.
This necessitates some unavoidable duplication of information between key and value.
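As a rough illustration of the note above, the plain-Java sketch below duplicates the temperature into a value field so that every reading in a year group can be read back in sorted order. It does not use Hadoop; KeyValue and valuesPerYear are hypothetical names introduced only for this example, and the readings are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SortedValuesSketch {

    /** Hypothetical map output record: composite key (year, temp) plus a value. */
    public static final class KeyValue {
        final int year;   // first key field, used for grouping
        final int temp;   // second key field, used only for sorting
        final int value;  // the temperature duplicated into the value

        public KeyValue(int year, int temp) {
            this.year = year;
            this.temp = temp;
            this.value = temp;
        }
    }

    /** Sorts by (year ascending, temp descending), then collects the values per year. */
    public static Map<Integer, List<Integer>> valuesPerYear(List<KeyValue> records) {
        List<KeyValue> sorted = new ArrayList<>(records);
        sorted.sort((a, b) -> {
            int byYear = Integer.compare(a.year, b.year);
            return byYear != 0 ? byYear : Integer.compare(b.temp, a.temp);
        });

        // LinkedHashMap keeps the year groups in the order they were first seen.
        Map<Integer, List<Integer>> groups = new LinkedHashMap<>();
        for (KeyValue kv : sorted) {
            groups.computeIfAbsent(kv.year, y -> new ArrayList<>()).add(kv.value);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Made-up sample readings, not data from the text.
        List<KeyValue> records = Arrays.asList(
            new KeyValue(1901, 250),
            new KeyValue(1900, 10),
            new KeyValue(1901, 317),
            new KeyValue(1900, 120));
        // prints {1900=[120, 10], 1901=[317, 250]}
        System.out.println(valuesPerYear(records));
    }
}
```

The values come out in descending temperature order within each year only because the sort ran on the composite key; the value field itself plays no part in the ordering.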
We set the partitioner to partition by the first field of the key (the year) using a custom
partitioner called FirstPartitioner. To sort keys by year (ascending) and temperature
(descending), we use a custom sort comparator, using setSortComparatorClass(), that extracts
the fields and performs the appropriate comparisons. Similarly, to group keys by year, we
set a custom comparator, using
setGroupingComparator-