import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class MaxTemperatureMapper
    extends Mapper<LongWritable, Text, Text, IntWritable> {

  private static final int MISSING = 9999;

  @Override
  public void map(LongWritable key, Text value, Context context)
      throws IOException, InterruptedException {

    String line = value.toString();
    String year = line.substring(15, 19);
    int airTemperature;
    if (line.charAt(87) == '+') { // parseInt doesn't like leading plus signs
      airTemperature = Integer.parseInt(line.substring(88, 92));
    } else {
      airTemperature = Integer.parseInt(line.substring(87, 92));
    }
    String quality = line.substring(92, 93);
    if (airTemperature != MISSING && quality.matches("[01459]")) {
      context.write(new Text(year), new IntWritable(airTemperature));
    }
  }
}
The Mapper class is a generic type, with four formal type parameters that specify the input key, input value, output key, and output value types of the map function. For the present example, the input key is a long integer offset, the input value is a line of text, the output key is a year, and the output value is an air temperature (an integer). Rather than using built-in Java types, Hadoop provides its own set of basic types that are optimized for network serialization. These are found in the org.apache.hadoop.io package. Here we use LongWritable, which corresponds to a Java Long, Text (like Java String), and IntWritable (like Java Integer).
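To illustrate one practical difference from the java.lang wrapper classes, here is a minimal sketch (the class name and values are hypothetical): Writables are mutable boxes whose contents can be reset, so a single object can be reused across many records.

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;

public class WritableDemo {
  public static void main(String[] args) {
    // Unlike java.lang.Integer, an IntWritable is mutable, so the
    // same object can be reused rather than reallocated per record.
    IntWritable temperature = new IntWritable(22);
    temperature.set(-11);
    Text year = new Text("1950");
    System.out.println(year + "\t" + temperature.get()); // prints: 1950	-11
  }
}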
The map() method is passed a key and a value. We convert the Text value containing the line of input into a Java String, then use its substring() method to extract the columns we are interested in.
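To make the fixed-width offsets concrete, the following sketch parses a hypothetical record padded with spaces so that the year begins at offset 15, the signed temperature (in tenths of a degree Celsius) occupies offsets 87 to 91, and the quality code sits at offset 92; real NCDC records carry many more fields.

public class ParseDemo {
  public static void main(String[] args) {
    // Hypothetical 93-character record: "1950" at offsets 15-18,
    // "-0011" (i.e., -1.1 degrees Celsius) at 87-91, quality "1" at 92.
    String line = String.format("%15s1950%68s-00111", "", "");
    System.out.println(line.substring(15, 19));                    // 1950
    System.out.println(Integer.parseInt(line.substring(87, 92)));  // -11
  }
}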
The map() method also provides an instance of Context to write the output to. In this case, we write the year as a Text object (since we are just using it as a key), and the temperature is wrapped in an IntWritable.
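To see what actually gets written to the Context, the mapper can be driven with a test harness. The following is a minimal sketch, assuming the Apache MRUnit and JUnit libraries are on the classpath; the input record is the same hypothetical padded line used in the parsing sketch above.

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mrunit.mapreduce.MapDriver;
import org.junit.Test;

public class MaxTemperatureMapperTest {
  @Test
  public void writesYearKeyAndTemperatureValue() throws Exception {
    // Hypothetical padded record: year 1950, temperature -1.1 degrees
    // Celsius (stored as -11 tenths), quality code 1.
    String line = String.format("%15s1950%68s-00111", "", "");
    new MapDriver<LongWritable, Text, Text, IntWritable>()
        .withMapper(new MaxTemperatureMapper())
        .withInput(new LongWritable(0), new Text(line))
        .withOutput(new Text("1950"), new IntWritable(-11))
        .runTest(); // passes only if the mapper emits exactly this pair
  }
}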