▪ Output files are named slightly differently: in the old API both map and reduce outputs are named part-nnnnn, whereas in the new API map outputs are named part-m-nnnnn and reduce outputs are named part-r-nnnnn (where nnnnn is an integer designating the part number, starting from 00000).
▪ User-overridable methods in the new API are declared to throw java.lang.InterruptedException. This means that you can write your code to be responsive to interrupts so that the framework can gracefully cancel long-running operations if it needs to.
▪ In the new API, the reduce() method passes values as a java.lang.Iterable, rather than a java.lang.Iterator (as the old API does). This change makes it easier to iterate over the values using Java's for-each loop construct: for (VALUEIN value : values) { ... }
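The interrupt-handling point above can be sketched in plain Java, with no Hadoop dependency: a worker method that declares throws InterruptedException (as the new-API map() and reduce() methods do) can be cancelled cleanly by whoever interrupts its thread. The names InterruptibleTask and processRecords are hypothetical, chosen for illustration only.

```java
// A minimal sketch of interrupt-responsive code. Blocking calls such as
// Thread.sleep() throw InterruptedException when the thread is interrupted,
// which lets a caller (the "framework" here) cancel the work gracefully.
public class InterruptibleTask {

    // Analogous to a new-API map() or reduce() method: by declaring
    // InterruptedException, it cooperates with cancellation.
    static void processRecords(int n) throws InterruptedException {
        for (int i = 0; i < n; i++) {
            Thread.sleep(10); // throws InterruptedException if interrupted
        }
    }

    public static void main(String[] args) throws Exception {
        Thread worker = new Thread(() -> {
            try {
                processRecords(1000); // would take ~10 seconds uninterrupted
                System.out.println("finished");
            } catch (InterruptedException e) {
                System.out.println("cancelled gracefully");
            }
        });
        worker.start();
        Thread.sleep(50);
        worker.interrupt(); // the framework decides to cancel the task
        worker.join();      // prints "cancelled gracefully"
    }
}
```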
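The Iterable-versus-Iterator difference can be shown in plain Java, without Hadoop: a for-each loop works directly on any Iterable, whereas a bare Iterator must be stepped by hand. The class name IterableVsIterator is hypothetical, for illustration only.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class IterableVsIterator {
    public static void main(String[] args) {
        // Stands in for the values passed to reduce().
        List<Integer> values = Arrays.asList(3, 1, 4);

        // New-API style: values is an Iterable, so for-each works directly.
        int sumIterable = 0;
        for (int value : values) {
            sumIterable += value;
        }

        // Old-API style: with a bare Iterator, you drive the loop manually.
        int sumIterator = 0;
        Iterator<Integer> it = values.iterator();
        while (it.hasNext()) {
            sumIterator += it.next();
        }

        System.out.println(sumIterable + " " + sumIterator); // prints "8 8"
    }
}
```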
WARNING
Programs using the new API that were compiled against Hadoop 1 need to be recompiled to run against
Hadoop 2. This is because some classes in the new MapReduce API changed to interfaces between the
Hadoop 1 and Hadoop 2 releases. The symptom is an error at runtime like the following:
java.lang.IncompatibleClassChangeError: Found interface
org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
Example D-1 shows the MaxTemperature application rewritten to use the old API. The differences are highlighted in bold.
WARNING
When converting your Mapper and Reducer classes to the new API, don't forget to change the signatures of the map() and reduce() methods to the new form. Just changing your class to extend the new Mapper or Reducer classes will not produce a compilation error or warning, because these classes provide identity forms of the map() and reduce() methods (respectively). Your mapper or reducer code, however, will not be invoked, which can lead to some hard-to-diagnose errors.
Annotating your map() and reduce() methods with the @Override annotation will allow the Java compiler to catch these errors.
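The pitfall can be demonstrated in plain Java, with no Hadoop dependency. Here BaseMapper is a stand-in for the new-API Mapper class with its identity map() implementation; BrokenMapper, FixedMapper, and OverridePitfall are hypothetical names for illustration.

```java
class BaseMapper {
    // Identity implementation, like the new-API Mapper.map().
    String map(String value) { return value; }
}

class BrokenMapper extends BaseMapper {
    // Wrong parameter type: this compiles without error or warning, but it
    // OVERLOADS rather than overrides map(), so it is never invoked through
    // a BaseMapper reference.
    String map(StringBuilder value) { return "processed:" + value; }
}

class FixedMapper extends BaseMapper {
    // With @Override, a mismatched signature becomes a compile-time error.
    @Override
    String map(String value) { return "processed:" + value; }
}

public class OverridePitfall {
    public static void main(String[] args) {
        BaseMapper broken = new BrokenMapper();
        BaseMapper fixed = new FixedMapper();
        System.out.println(broken.map("input")); // prints "input" (identity!)
        System.out.println(fixed.map("input"));  // prints "processed:input"
    }
}
```

Calling broken.map("input") silently falls through to the identity method, which is exactly the hard-to-diagnose behavior the warning describes.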
Example D-1. Application to find the maximum temperature, using the old MapReduce API
public class OldMaxTemperature {