Table 9-2. Built-in MapReduce task counters
Counter
Description
Map input records (MAP_INPUT_RECORDS)
The number of input records consumed by all the maps in the
job. Incremented every time a record is read from a RecordReader
and passed to the map's map() method by the framework.
Split raw bytes (SPLIT_RAW_BYTES)
The number of bytes of input-split objects read by maps. These
objects represent the split metadata (that is, the offset and length
within a file) rather than the split data itself, so the total size
should be small.
Map output records (MAP_OUTPUT_RECORDS)
The number of map output records produced by all the maps in
the job. Incremented every time the collect() method is
called on a map's OutputCollector.
Map output bytes (MAP_OUTPUT_BYTES)
The number of bytes of uncompressed output produced by all
the maps in the job. Incremented every time the collect()
method is called on a map's OutputCollector.
Map output materialized bytes (MAP_OUTPUT_MATERIALIZED_BYTES)
The number of bytes of map output actually written to disk. If
map output compression is enabled, this is reflected in the
counter value.
Combine input records (COMBINE_INPUT_RECORDS)
The number of input records consumed by all the combiners (if
any) in the job. Incremented every time a value is read from the
combiner's iterator over values. Note that this count is the
number of values consumed by the combiner, not the number of
distinct key groups (which would not be a useful metric, since
there is not necessarily one group per key for a combiner; see
Combiner Functions, and also Shuffle and Sort).
Combine output records (COMBINE_OUTPUT_RECORDS)
The number of output records produced by all the combiners (if
any) in the job. Incremented every time the collect() method
is called on a combiner's OutputCollector.
Reduce input groups (REDUCE_INPUT_GROUPS)
The number of distinct key groups consumed by all the reducers
in the job. Incremented every time the reducer's reduce()
method is called by the framework.
Reduce input records (REDUCE_INPUT_RECORDS)
The number of input records consumed by all the reducers in
the job. Incremented every time a value is read from the
reducer's iterator over values. If reducers consume all of their
inputs, and no combiner has reduced the map output along the way,
this count should be the same as the count for map output
records.
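
To make the increment rules in the table concrete, here is a minimal in-memory sketch of a word count job that bumps a counter at each of the points described. It is only an illustration under stated assumptions: it does not use the Hadoop API, the CounterSketch class and its methods are hypothetical, each map task is assumed to run its combiner exactly once, and the byte counters are left out because they depend on serialization details.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy word count that mimics WHEN each record counter would be incremented.
// Not Hadoop code: every class and method name here is hypothetical.
class CounterSketch {
    final Map<String, Long> counters = new TreeMap<>();
    final Map<String, Integer> result = new TreeMap<>();

    private void inc(String name) {
        counters.merge(name, 1L, Long::sum);
    }

    // One map task: reads its records, emits (word, 1) pairs, then runs
    // the combiner once over its own output (an assumption of this sketch).
    private List<Map.Entry<String, Integer>> mapTask(List<String> records) {
        List<Map.Entry<String, Integer>> mapOutput = new ArrayList<>();
        for (String record : records) {
            inc("MAP_INPUT_RECORDS");          // a record is passed to map()
            for (String word : record.split(" ")) {
                mapOutput.add(Map.entry(word, 1));
                inc("MAP_OUTPUT_RECORDS");     // one collect() call in the map
            }
        }
        return sumByKey(mapOutput, null,
                "COMBINE_INPUT_RECORDS", "COMBINE_OUTPUT_RECORDS");
    }

    // Groups pairs by key and sums each group. groupCounter (if non-null) is
    // bumped once per group, valueCounter once per value read from a group,
    // and outputCounter (if non-null) once per record emitted.
    private List<Map.Entry<String, Integer>> sumByKey(
            List<Map.Entry<String, Integer>> pairs,
            String groupCounter, String valueCounter, String outputCounter) {
        Map<String, List<Integer>> groups = new TreeMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            groups.computeIfAbsent(p.getKey(), k -> new ArrayList<>())
                  .add(p.getValue());
        }
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (Map.Entry<String, List<Integer>> g : groups.entrySet()) {
            if (groupCounter != null) inc(groupCounter);
            int sum = 0;
            for (int v : g.getValue()) {
                inc(valueCounter);             // a value read from the iterator
                sum += v;
            }
            out.add(Map.entry(g.getKey(), sum));
            if (outputCounter != null) inc(outputCounter);
        }
        return out;
    }

    // Runs one map task per split, then a single reducer over all output.
    static CounterSketch run(List<List<String>> splits) {
        CounterSketch job = new CounterSketch();
        List<Map.Entry<String, Integer>> shuffled = new ArrayList<>();
        for (List<String> split : splits) {
            shuffled.addAll(job.mapTask(split));
        }
        for (Map.Entry<String, Integer> e : job.sumByKey(shuffled,
                "REDUCE_INPUT_GROUPS", "REDUCE_INPUT_RECORDS", null)) {
            job.result.put(e.getKey(), e.getValue());
        }
        return job;
    }

    public static void main(String[] args) {
        CounterSketch job = run(List.of(
                List.of("a b a", "b c"),   // split for the first map task
                List.of("a c")));          // split for the second map task
        System.out.println(job.counters);
        System.out.println(job.result);
    }
}
```

With the two splits used in main(), the sketch counts 3 map input records and 7 map output records. The combiners read all 7 values and emit 5 records, so the single reducer sees 3 key groups and 5 input records. Because a combiner runs in this sketch, the reduce input record count matches the combine output count rather than the map output count.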