Table 9-2. Built-in MapReduce task counters
Counter
Description
Map input records (MAP_INPUT_RECORDS)
The number of input records consumed by all the maps in the job. Incremented every time a record is read from a RecordReader and passed to the map's map() method by the framework.
Split raw bytes (SPLIT_RAW_BYTES)
The number of bytes of input-split objects read by maps. These objects represent the split metadata (that is, the offset and length within a file) rather than the split data itself, so the total size should be small.
Map output records (MAP_OUTPUT_RECORDS)
The number of map output records produced by all the maps in the job. Incremented every time the collect() method is called on a map's OutputCollector.
Map output bytes (MAP_OUTPUT_BYTES)
The number of bytes of uncompressed output produced by all the maps in the job. Incremented every time the collect() method is called on a map's OutputCollector.
Map output materialized bytes (MAP_OUTPUT_MATERIALIZED_BYTES)
The number of bytes of map output actually written to disk. If map output compression is enabled, this is reflected in the counter value.
Combine input records (COMBINE_INPUT_RECORDS)
The number of input records consumed by all the combiners (if any) in the job. Incremented every time a value is read from the combiner's iterator over values. Note that this count is the number of values consumed by the combiner, not the number of distinct key groups (which would not be a useful metric, since there is not necessarily one group per key for a combiner; see Combiner Functions, and also Shuffle and Sort).
Combine output records (COMBINE_OUTPUT_RECORDS)
The number of output records produced by all the combiners (if any) in the job. Incremented every time the collect() method is called on a combiner's OutputCollector.
Reduce input groups (REDUCE_INPUT_GROUPS)
The number of distinct key groups consumed by all the reducers in the job. Incremented every time the reducer's reduce() method is called by the framework.
Reduce input records (REDUCE_INPUT_RECORDS)
The number of input records consumed by all the reducers in the job. Incremented every time a value is read from the reducer's iterator over values. If reducers consume all of their inputs and no combiner has reduced the map output, this count should be the same as the count for map output records.
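The increment rules in the table can be illustrated with a small in-memory model. The following is a minimal sketch, not Hadoop code: the class CounterSketch, its Counter enum, and the run() method are hypothetical names invented for this example, and the "job" is a toy word count with no combiner. It only shows when each counter would be incremented, under the assumption that the reducer reads every value it is given.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of the counter semantics described in the table.
// Hypothetical code: none of these names belong to the Hadoop API.
class CounterSketch {

    // A subset of the counter names from the table.
    enum Counter {
        MAP_INPUT_RECORDS, MAP_OUTPUT_RECORDS,
        REDUCE_INPUT_GROUPS, REDUCE_INPUT_RECORDS
    }

    // Runs a word count over the given lines and returns the counter totals.
    static Map<Counter, Long> run(List<String> lines) {
        Map<Counter, Long> counters = new EnumMap<>(Counter.class);
        for (Counter c : Counter.values()) {
            counters.put(c, 0L);
        }

        // Stand-in for the shuffle: map output grouped and sorted by key.
        Map<String, List<Integer>> grouped = new TreeMap<>();

        // Map phase: one input record per line, one output record per word.
        for (String line : lines) {
            counters.merge(Counter.MAP_INPUT_RECORDS, 1L, Long::sum);
            for (String word : line.trim().split("\\s+")) {
                if (word.isEmpty()) {
                    continue; // blank line: input record consumed, nothing emitted
                }
                // Stand-in for a collect() call on the map's output collector.
                grouped.computeIfAbsent(word, k -> new ArrayList<>()).add(1);
                counters.merge(Counter.MAP_OUTPUT_RECORDS, 1L, Long::sum);
            }
        }

        // Reduce phase: one group per distinct key, one record per value read.
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            counters.merge(Counter.REDUCE_INPUT_GROUPS, 1L, Long::sum);
            for (Integer ignored : entry.getValue()) {
                counters.merge(Counter.REDUCE_INPUT_RECORDS, 1L, Long::sum);
            }
        }
        return counters;
    }

    public static void main(String[] args) {
        // Two lines, five words, three distinct words.
        System.out.println(run(List.of("a b a", "b c")));
        // prints {MAP_INPUT_RECORDS=2, MAP_OUTPUT_RECORDS=5, REDUCE_INPUT_GROUPS=3, REDUCE_INPUT_RECORDS=5}
    }
}
```

In this run the reduce input records total matches the map output records total because every value is read and nothing is combined, while the reduce input groups total is smaller because it counts distinct keys rather than values.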