Table 9-1. Built-in counter groups
Group                      Name/Enum                                                       Reference
MapReduce task counters    org.apache.hadoop.mapreduce.TaskCounter                         Table 9-2
Filesystem counters        org.apache.hadoop.mapreduce.FileSystemCounter                   Table 9-3
FileInputFormat counters   org.apache.hadoop.mapreduce.lib.input.FileInputFormatCounter    Table 9-4
FileOutputFormat counters  org.apache.hadoop.mapreduce.lib.output.FileOutputFormatCounter  Table 9-5
Job counters               org.apache.hadoop.mapreduce.JobCounter                          Table 9-6
Each group contains either task counters (which are updated as a task progresses) or job counters (which are updated as a job progresses). We look at both types in the following sections.
Task counters
Task counters gather information about tasks over the course of their execution, and the results are aggregated over all the tasks in a job. The MAP_INPUT_RECORDS counter, for example, counts the input records read by each map task and aggregates over all map tasks in a job, so that the final figure is the total number of input records for the whole job.
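The aggregation described above can be pictured with a small, self-contained sketch. It does not use the Hadoop API: the SketchCounter enum and the aggregate method are hypothetical stand-ins, and the per-task figures are made up for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

class CounterAggregationSketch {
    // Hypothetical stand-in for a few task counter names; not Hadoop's TaskCounter enum.
    enum SketchCounter { MAP_INPUT_RECORDS, MAP_OUTPUT_RECORDS }

    // Sums each counter over the values reported by every task in the job.
    static Map<SketchCounter, Long> aggregate(List<Map<SketchCounter, Long>> perTask) {
        Map<SketchCounter, Long> totals = new EnumMap<>(SketchCounter.class);
        for (Map<SketchCounter, Long> task : perTask) {
            for (Map.Entry<SketchCounter, Long> e : task.entrySet()) {
                totals.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Three map tasks, each reporting how many input records it read (illustrative values).
        List<Map<SketchCounter, Long>> perTask = Arrays.asList(
            Collections.singletonMap(SketchCounter.MAP_INPUT_RECORDS, 1200L),
            Collections.singletonMap(SketchCounter.MAP_INPUT_RECORDS, 800L),
            Collections.singletonMap(SketchCounter.MAP_INPUT_RECORDS, 1000L));
        // The job-level figure is the sum over all map tasks.
        System.out.println(aggregate(perTask).get(SketchCounter.MAP_INPUT_RECORDS)); // 3000
    }
}
```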
Task counters are maintained by each task attempt and periodically sent to the application master so they can be globally aggregated. (This is described in Progress and Status Updates.) Task counters are sent in full every time, rather than as the counts since the last transmission, since this guards against errors due to lost messages. Furthermore, during a job run, counters may go down if a task fails.
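One way to see why sending full values tolerates lost messages, and why a total can fall when a task fails, is the toy model below. It is a sketch under stated assumptions rather than the application master's real logic: the class, its methods, and the attempt IDs are all hypothetical, and it assumes a failed attempt's counters are simply dropped from the total.

```java
import java.util.HashMap;
import java.util.Map;

class SnapshotAggregatorSketch {
    // Latest full counter value reported by each task attempt (hypothetical model).
    private final Map<String, Long> latestByAttempt = new HashMap<>();

    // Each report carries the attempt's full running value, so it overwrites the
    // previous one. If an earlier report was lost, the next one still corrects the total.
    void report(String attemptId, long fullValue) {
        latestByAttempt.put(attemptId, fullValue);
    }

    // Assumption: a failed attempt's contribution is discarded, so the total can drop.
    void attemptFailed(String attemptId) {
        latestByAttempt.remove(attemptId);
    }

    // Job-level value: the sum of the latest value from every live attempt.
    long total() {
        long sum = 0;
        for (long v : latestByAttempt.values()) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        SnapshotAggregatorSketch am = new SnapshotAggregatorSketch();
        am.report("attempt-A", 100);
        // Suppose a report carrying 250 is lost here; nothing needs to be replayed.
        am.report("attempt-A", 400);
        am.report("attempt-B", 300);
        System.out.println(am.total()); // 700
        am.attemptFailed("attempt-B");
        System.out.println(am.total()); // 400
    }
}
```

Had each report carried only the increment since the previous one, the lost message would have left the total permanently short by that increment.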
Counter values are definitive only once a job has successfully completed. However, some counters provide useful diagnostic information as a task is progressing, and it can be useful to monitor them with the web UI. For example, PHYSICAL_MEMORY_BYTES, VIRTUAL_MEMORY_BYTES, and COMMITTED_HEAP_BYTES provide an indication of how memory usage varies over the course of a particular task attempt.
The built-in task counters include those in the MapReduce task counters group (Table 9-2)