                       OutputCollector<Text, DoubleWritable> output,
                       Reporter reporter) throws IOException {
        double sum = 0;
        int count = 0;
        while (values.hasNext()) {
            String[] fields = values.next().toString().split(",");
            sum += Double.parseDouble(fields[0]);
            count += Integer.parseInt(fields[1]);
        }
        output.collect(key, new DoubleWritable(sum / count));
    }
}
The logic of the refactored MapReduce job was not too hard to follow, was it? We
added an explicit count for each key/value pair. This refactoring allows the intermediate
data to be combined at each mapper before it's sent across the network.
Programmatically, the combiner must implement the Reducer interface. The
combiner's reduce() method performs the combining operation. This may seem like
a bad naming scheme, but recall that for the important class of distributive functions,
the combiner and the reducer perform the same operations. Therefore, the combiner
has adopted the reducer's signature to simplify its reuse. You don't have to rename
your Reduce class to use it as a combiner class. In addition, because the combiner is
performing an equivalent transformation, the type for the key/value pair in its output
must match that of its input. In the end, we've created a Combine class that looks
similar to the Reduce class, except it only outputs the (partial) sum and count at the
end, whereas the reducer computes the final average.
public static class Combine extends MapReduceBase
    implements Reducer<Text, Text, Text, Text> {
    public void reduce(Text key, Iterator<Text> values,
                       OutputCollector<Text, Text> output,
                       Reporter reporter) throws IOException {
        double sum = 0;
        int count = 0;
        while (values.hasNext()) {
            String[] fields = values.next().toString().split(",");
            sum += Double.parseDouble(fields[0]);
            count += Integer.parseInt(fields[1]);
        }
        output.collect(key, new Text(sum + "," + count));
    }
}
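One way to convince yourself that the combiner does not change the final answer is to run the same arithmetic outside Hadoop. The sketch below is plain Java with no Hadoop dependency. The class name CombinerSketch, the helper methods combine() and reduce(), and the sample values are all made up for illustration. They only mirror the bodies of the Combine and Reduce classes shown here, operating on "sum,count" strings for a single key.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone sketch (not part of the Hadoop job itself).
final class CombinerSketch {

    // Mirrors Combine.reduce(): folds "sum,count" strings into one partial "sum,count".
    static String combine(List<String> values) {
        double sum = 0;
        int count = 0;
        for (String value : values) {
            String[] fields = value.split(",");
            sum += Double.parseDouble(fields[0]);
            count += Integer.parseInt(fields[1]);
        }
        return sum + "," + count;
    }

    // Mirrors Reduce.reduce(): folds the same way, then computes the final average.
    static double reduce(List<String> values) {
        double sum = 0;
        int count = 0;
        for (String value : values) {
            String[] fields = value.split(",");
            sum += Double.parseDouble(fields[0]);
            count += Integer.parseInt(fields[1]);
        }
        return sum / count;
    }

    public static void main(String[] args) {
        // Assumed sample data: two mappers emit "value,1" pairs for the same key.
        List<String> mapper1 = Arrays.asList("10.0,1", "20.0,1");
        List<String> mapper2 = Arrays.asList("60.0,1");

        // With a combiner: each mapper's output is collapsed before the "shuffle".
        double withCombiner =
            reduce(Arrays.asList(combine(mapper1), combine(mapper2)));

        // Without a combiner: the reducer sees every raw pair.
        double withoutCombiner =
            reduce(Arrays.asList("10.0,1", "20.0,1", "60.0,1"));

        System.out.println(withCombiner + " " + withoutCombiner); // prints "30.0 30.0"
    }
}
```

With the combiner, the reducer receives two records ("30.0,2" and "60.0,1") instead of three, yet the average is unchanged. This works because the count travels with each partial sum. Averaging the per-mapper averages directly would give a different, wrong result.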
To enable the combiner, the driver must specify the combiner's class to the JobConf
object. You can do this through the setCombinerClass() method. The driver sets
the mapper, combiner, and the reducer:
 