The lower part of the table shows the total number of failed and killed task attempts for
the map or reduce tasks. Task attempts may be marked as killed if they are speculative
execution duplicates, if the node they are running on dies, or if they are killed by a user. See
Task Failure for background on task failure.
There are also a number of useful links in the navigation. For example, the "Configuration"
link points to the consolidated configuration file for the job, containing all the properties
and their values that were in effect during the job run. If you are unsure of what a particular
property was set to, you can click through to inspect the file.
Retrieving the Results
Once the job is finished, there are various ways to retrieve the results. Each reducer
produces one output file, so there are 30 part files named part-r-00000 to part-r-00029 in the
max-temp directory.
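Before copying anything, a simple directory listing confirms what the reducers wrote. This assumes the job's output directory is max-temp, as above:

% hadoop fs -ls max-temp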
NOTE
As their names suggest, a good way to think of these “part” files is as parts of the max-temp “file.”
If the output is large (which it isn't in this case), it is important to have multiple parts so that more than
one reducer can work in parallel. Usually, if a file is in this partitioned form, it can still be used easily
enough, as the input to another MapReduce job, for example; a sketch of this follows the note. In some cases,
you can exploit the structure of multiple partitions to do a map-side join (see Map-Side Joins).
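To make the "input to another job" point concrete, here is a minimal sketch that feeds the whole max-temp directory to a second job using Hadoop Streaming. The streaming jar location varies by installation, and the output directory name and the trivial cat/wc mapper and reducer are placeholders for illustration, not part of the original example:

% hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
    -input max-temp \
    -output max-temp-counts \
    -mapper /bin/cat \
    -reducer /usr/bin/wc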
This job produces a very small amount of output, so it is convenient to copy it from HDFS
to our development machine. The -getmerge option to the hadoop fs command is
useful here, as it gets all the files in the directory specified in the source pattern and
merges them into a single file on the local filesystem:
% hadoop fs -getmerge max-temp max-temp-local
% sort max-temp-local | tail
1991 607
1992 605
1993 567
1994 568
1995 567
1996 561
1997 565
1998 568
1999 568
2000 558
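If you prefer not to create a local copy at all, an alternative for small outputs is to stream the part files directly with the -cat option. This is a sketch that assumes the part-r-* naming shown above; the quotes stop the local shell from expanding the glob so that hadoop fs handles it against HDFS:

% hadoop fs -cat 'max-temp/part-r-*' | sort | tail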