[Figure 7-5: Efficiently merging 40 file segments with a merge factor of 10]
During the reduce phase, the reduce function is invoked for each key in the sorted output. The output of this phase is written directly to the output filesystem, typically HDFS. In the case of HDFS, because the node manager is also running a datanode, the first block replica will be written to the local disk.
Configuration Tuning
We are now in a better position to understand how to tune the shuffle to improve MapReduce performance. The relevant settings, which can be used on a per-job basis (except where noted), are summarized in Tables 7-1 and 7-2, along with the defaults, which are good for general-purpose jobs.
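As a concrete illustration, per-job settings like these can be applied programmatically through the Configuration object before the job is submitted (they can equally be passed on the command line with -D). The following driver is a minimal sketch, not a recommendation: the class name, job name, and property values are illustrative.

    // Driver sketch: applying per-job shuffle settings programmatically.
    // Property values are placeholders; see Tables 7-1 and 7-2 for details.
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class TunedJobDriver {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.setInt("mapreduce.task.io.sort.mb", 200);            // map-side sort buffer, in MB
        conf.setFloat("mapreduce.map.sort.spill.percent", 0.80f); // buffer fill level that triggers a spill
        Job job = Job.getInstance(conf, "tuned job");
        // ... set the mapper, reducer, and input/output paths as usual ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }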
The general principle is to give the shuffle as much memory as possible. However, there is
a trade-off, in that you need to make sure that your map and reduce functions get enough
memory to operate. This is why it is best to write your map and reduce functions to use as
little memory as possible — certainly they should not use an unbounded amount of
memory (avoid accumulating values in a map, for example).
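For example, a reducer that sums the values for a key can keep a single running total instead of first collecting the values into a collection. A minimal sketch (the class name is illustrative):

    // Reducer sketch that uses constant memory per key: each value is
    // consumed from the iterator and discarded, never stored in a collection.
    import java.io.IOException;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class StreamingSumReducer
        extends Reducer<Text, IntWritable, Text, IntWritable> {
      @Override
      protected void reduce(Text key, Iterable<IntWritable> values, Context context)
          throws IOException, InterruptedException {
        int sum = 0;                      // single accumulator, bounded memory
        for (IntWritable value : values) {
          sum += value.get();
        }
        context.write(key, new IntWritable(sum));
      }
    }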
The amount of memory given to the JVMs in which the map and reduce tasks run is set by the mapred.child.java.opts property. You should try to make this as large as possible for the amount of memory on your task nodes; the discussion in Memory settings in YARN and MapReduce goes through the constraints to consider.
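For instance, to give each task JVM a 1 GB heap for a particular job, the property can be set in the same style as the driver sketch earlier. The -Xmx value is an assumption for illustration; size it to fit the memory actually available on your task nodes.

    // Sketch: setting the task JVM heap for one job (value is illustrative).
    Configuration conf = new Configuration();
    conf.set("mapred.child.java.opts", "-Xmx1024m");
    Job job = Job.getInstance(conf, "large-heap job");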
On the map side, the best performance can be obtained by avoiding multiple spills to disk; one is optimal. If you can estimate the size of your map outputs, you can set the mapreduce.task.io.sort.* properties appropriately to minimize the number of spills. In particular, you should increase mapreduce.task.io.sort.mb if you can. There is a MapReduce counter (SPILLED_RECORDS; see Counters) that counts the total number of records that were spilled to disk over the course of a job, which can be useful for tuning. Note that the counter includes both map- and reduce-side spills.
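One way to use the counter for tuning is to read it after the job finishes and check whether raising mapreduce.task.io.sort.mb reduces it. A sketch, continuing the driver pattern above (TaskCounter is Hadoop's built-in task counter enum):

    // After the job completes, report the total records spilled to disk.
    // Remember this total includes both map- and reduce-side spills.
    import org.apache.hadoop.mapreduce.TaskCounter;

    if (job.waitForCompletion(true)) {
      long spilled = job.getCounters()
                        .findCounter(TaskCounter.SPILLED_RECORDS)
                        .getValue();
      System.out.println("Spilled records: " + spilled);
    }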
On the reduce side, the best performance is obtained when the intermediate data can reside entirely in memory. This does not happen by default, since for the general case all the memory is reserved for the reduce function. But if your reduce function has light memory requirements, setting mapreduce.reduce.merge.inmem.threshold to 0 and mapreduce.reduce.input.buffer.percent to 1.0 (or a lower value; see Table 7-2) may bring a performance boost.
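A sketch of applying both settings in the driver, continuing the pattern above and assuming the reduce function's own memory needs are light:

    // Sketch: keep reduce inputs in memory through the merge and the reduce
    // itself. Only appropriate when the reduce function needs little memory.
    conf.setInt("mapreduce.reduce.merge.inmem.threshold", 0);
    conf.setFloat("mapreduce.reduce.input.buffer.percent", 1.0f);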