can be on the local filesystem, on HDFS, or on another Hadoop-readable filesystem (such as S3). If no scheme is supplied, then the files are assumed to be local. (This is true even when the default filesystem is not the local filesystem.)

You can also copy archive files (JAR files, ZIP files, tar files, and gzipped tar files) to your tasks using the -archives option; these are unarchived on the task node. The -libjars option will add JAR files to the classpath of the mapper and reducer tasks. This is useful if you haven't bundled library JAR files in your job JAR file.
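As an aside, the point of shipping a metadata file with -files is usually to load it into an in-memory lookup table on the task node. A minimal, self-contained sketch of loading a fixed-width station file into a map might look like the following; the class name, the 12-character ID column width, and the sample records are illustrative assumptions, not the actual format of stations-fixed-width.txt:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.Reader;
    import java.io.StringReader;
    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical helper: builds a station-ID-to-name map from fixed-width
    // records. Column widths here are assumed for illustration only.
    public class StationLookupSketch {
      public static Map<String, String> load(Reader reader) throws IOException {
        Map<String, String> lookup = new HashMap<>();
        BufferedReader in = new BufferedReader(reader);
        String line;
        while ((line = in.readLine()) != null) {
          String id = line.substring(0, 12).trim();   // assumed: first 12 chars are the ID
          String name = line.substring(12).trim();    // assumed: remainder is the name
          lookup.put(id, name);
        }
        return lookup;
      }

      public static void main(String[] args) throws IOException {
        String sample = "011990-99999 SIHCCAJAVRI\n012650-99999 TYNSET-HANSMOEN";
        Map<String, String> lookup = load(new StringReader(sample));
        System.out.println(lookup.get("011990-99999")); // prints SIHCCAJAVRI
      }
    }

A task would typically build such a map once, in its setup method, and then consult it per record.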
Let's see how to use the distributed cache to share a metadata file for station names. The command we will run is:

    % hadoop jar hadoop-examples.jar \
        MaxTemperatureByStationNameUsingDistributedCacheFile \
        -files input/ncdc/metadata/stations-fixed-width.txt input/ncdc/all output
This command will copy the local file stations-fixed-width.txt (no scheme is supplied, so the path is automatically interpreted as a local file) to the task nodes, so we can use it to look up station names. The listing for MaxTemperatureByStationNameUsingDistributedCacheFile appears in Example 9-13.
Example 9-13. Application to find the maximum temperature by station, showing station
names from a lookup table passed as a distributed cache file
    public class MaxTemperatureByStationNameUsingDistributedCacheFile
        extends Configured implements Tool {

      static class StationTemperatureMapper
          extends Mapper<LongWritable, Text, Text, IntWritable> {

        private NcdcRecordParser parser = new NcdcRecordParser();

        @Override
        protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {

          parser.parse(value);
          if (parser.isValidTemperature()) {
            context.write(new Text(parser.getStationId()),
                new IntWritable(parser.getAirTemperature()));
          }
        }
      }

      static class MaxTemperatureReducerWithStationLookup