Table 7-2. Reduce-side tuning properties
Property name | Type | Default value | Description
mapreduce.reduce.shuffle.parallelcopies | int | 5 | The number of threads used to copy map outputs to the reducer.
mapreduce.reduce.shuffle.maxfetchfailures | int | 10 | The number of times a reducer tries to fetch a map output before reporting the error.
mapreduce.task.io.sort.factor | int | 10 | The maximum number of streams to merge at once when sorting files. This property is also used in the map.
mapreduce.reduce.shuffle.input.buffer.percent | float | 0.70 | The proportion of total heap size to be allocated to the map outputs buffer during the copy phase of the shuffle.
mapreduce.reduce.shuffle.merge.percent | float | 0.66 | The threshold usage proportion for the map outputs buffer (defined by mapreduce.reduce.shuffle.input.buffer.percent) for starting the process of merging the outputs and spilling to disk.
mapreduce.reduce.merge.inmem.threshold | int | 1000 | The threshold number of map outputs for starting the process of merging the outputs and spilling to disk. A value of 0 or less means there is no threshold, and the spill behavior is governed solely by mapreduce.reduce.shuffle.merge.percent.
mapreduce.reduce.input.buffer.percent | float | 0.0 | The proportion of total heap size to be used for retaining map outputs in memory during the reduce. For the reduce phase to begin, the size of map outputs in memory must be no more than this size. By default, all map outputs are merged to disk before the reduce begins, to give the reducers as much memory as possible. However, if your reducers require less memory, this value may be increased to minimize the number of trips to disk.
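To make the interaction between the memory-related properties concrete, the following Python sketch works through the arithmetic the descriptions imply. It is a simplified model, not Hadoop's implementation: the function names shuffle_thresholds and should_merge are hypothetical, the heap size is passed in by hand, and the use of "greater than or equal" for both thresholds is an assumption.

```python
# Defaults copied from Table 7-2.
DEFAULTS = {
    "mapreduce.reduce.shuffle.input.buffer.percent": 0.70,
    "mapreduce.reduce.shuffle.merge.percent": 0.66,
    "mapreduce.reduce.merge.inmem.threshold": 1000,
    "mapreduce.reduce.input.buffer.percent": 0.0,
}


def shuffle_thresholds(heap_bytes, conf=DEFAULTS):
    """Return the sizes implied by the reduce-side memory properties.

    Hypothetical helper: a model of the table's descriptions only.
    """
    # The map outputs buffer is a proportion of the total heap.
    buffer_bytes = heap_bytes * conf["mapreduce.reduce.shuffle.input.buffer.percent"]
    return {
        "shuffle_buffer_bytes": buffer_bytes,
        # Merging and spilling start at a proportion of that buffer.
        "merge_trigger_bytes": buffer_bytes * conf["mapreduce.reduce.shuffle.merge.percent"],
        # Map outputs that may stay in memory once the reduce begins.
        "reduce_retain_bytes": heap_bytes * conf["mapreduce.reduce.input.buffer.percent"],
    }


def should_merge(buffered_bytes, buffered_outputs, heap_bytes, conf=DEFAULTS):
    """Report whether either merge threshold has been reached (assumed >=)."""
    limits = shuffle_thresholds(heap_bytes, conf)
    count_limit = conf["mapreduce.reduce.merge.inmem.threshold"]
    over_bytes = buffered_bytes >= limits["merge_trigger_bytes"]
    # A value of 0 or less disables the count-based threshold.
    over_count = count_limit > 0 and buffered_outputs >= count_limit
    return over_bytes or over_count
```

With the defaults and a heap of 1,000 MB, the map outputs buffer comes to about 700 MB (0.70 of the heap) and merging starts once roughly 462 MB of it is in use (0.66 of the buffer), or once 1,000 map outputs have accumulated, whichever happens first. Because mapreduce.reduce.input.buffer.percent defaults to 0.0, reduce_retain_bytes is zero, which matches the description that all map outputs are merged to disk before the reduce begins.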