JOB, TASK, AND TASK ATTEMPT IDS
In Hadoop 2, MapReduce job IDs are generated from YARN application IDs that are created by the
YARN resource manager. An application ID is composed of the time that the resource
manager (not the application) started and an incrementing counter that the resource manager maintains to
uniquely identify the application to that instance of the resource manager. So the application with this
ID:
application_1410450250506_0003
is the third (0003; application IDs are 1-based) application run by the resource manager, which started
at the time represented by the timestamp 1410450250506. The counter is formatted with leading zeros
to make IDs sort nicely (in directory listings, for example). However, when the counter reaches
10000, it is not reset, resulting in longer application IDs (which don't sort so well).
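As a minimal sketch of the format just described, the following Python splits an application ID into its two parts and shows why IDs stop sorting nicely once the counter passes 9999. The function names are hypothetical helpers, not part of any Hadoop API. The four-digit minimum width is inferred from the example ID above, and reading the timestamp as milliseconds since the Unix epoch is an assumption.

```python
import re
from datetime import datetime, timezone

# Pattern assumed from the example ID: prefix, start timestamp, counter.
APP_ID = re.compile(r"^application_(\d+)_(\d+)$")

def parse_application_id(app_id):
    """Return (resource_manager_start_timestamp, counter) as integers."""
    match = APP_ID.match(app_id)
    if match is None:
        raise ValueError(f"not an application ID: {app_id!r}")
    return int(match.group(1)), int(match.group(2))

def format_application_id(start_timestamp, counter):
    """Zero-pad the counter to at least four digits; wider values are kept whole."""
    return f"application_{start_timestamp}_{counter:04d}"

# Interpret the timestamp as milliseconds since the epoch (an assumption).
timestamp, counter = parse_application_id("application_1410450250506_0003")
started = datetime.fromtimestamp(timestamp / 1000, tz=timezone.utc)

# Plain string sorting breaks once the counter needs five digits.
ids = [format_application_id(1410450250506, n) for n in (9999, 10000, 2)]
lexical = sorted(ids)                            # 0002, 10000, 9999
numeric = sorted(ids, key=parse_application_id)  # 0002, 9999, 10000
```

Sorting on the parsed integers rather than on the raw strings keeps the order correct regardless of counter width.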
The corresponding job ID is created simply by replacing the application prefix of an application ID
with a job prefix:
job_1410450250506_0003
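The prefix swap can be sketched in a few lines of Python. The helper name application_id_to_job_id is hypothetical and only illustrates the string relationship between the two IDs; it is not how Hadoop itself performs the conversion.

```python
APPLICATION_PREFIX = "application_"
JOB_PREFIX = "job_"

def application_id_to_job_id(app_id):
    """Swap the application prefix for the job prefix, keeping the rest intact."""
    if not app_id.startswith(APPLICATION_PREFIX):
        raise ValueError(f"not an application ID: {app_id!r}")
    return JOB_PREFIX + app_id[len(APPLICATION_PREFIX):]

job_id = application_id_to_job_id("application_1410450250506_0003")
```

Because only the prefix changes, the timestamp and the zero-padded counter carry over unchanged.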
Tasks belong to a job, and their IDs are formed by replacing the job prefix of a job ID with a task prefix
and adding a suffix to identify the task within the job. For example:
task_1410450250506_0003_m_000003
is the fourth (000003; task IDs are 0-based) map (m) task of the job with ID
job_1410450250506_0003. The task IDs are created for a job when it is initialized, so they do not
necessarily dictate the order in which the tasks will be executed.
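A task ID can be taken apart the same way. The sketch below uses a hypothetical parse_task_id helper and, as a simplifying assumption, recognizes only the m (map) and r (reduce) type letters; the text above shows only m, and other type letters are rejected here.

```python
import re

# Pattern assumed from the example: job part, task type letter, 0-based index.
TASK_ID = re.compile(r"^task_(\d+)_(\d+)_([mr])_(\d+)$")
TASK_TYPES = {"m": "map", "r": "reduce"}

def parse_task_id(task_id):
    """Return (job_id, task_type, zero_based_index) for a task ID string."""
    match = TASK_ID.match(task_id)
    if match is None:
        raise ValueError(f"not a recognized task ID: {task_id!r}")
    timestamp, counter, type_letter, index = match.groups()
    # Rebuild the job ID from the original text so zero padding is preserved.
    job_id = f"job_{timestamp}_{counter}"
    return job_id, TASK_TYPES[type_letter], int(index)

job_id, task_type, index = parse_task_id("task_1410450250506_0003_m_000003")
ordinal = index + 1  # index 3 is the fourth task, since numbering starts at 0
```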
Tasks may be executed more than once, due to failure (see Task Failure) or speculative execution (see
Speculative Execution), so to identify different instances of a task execution, task attempts are given
unique IDs. For example:
attempt_1410450250506_0003_m_000003_0
is the first (0; attempt IDs are 0-based) attempt at running task
task_1410450250506_0003_m_000003. Task attempts are allocated during the job run as
needed, so their ordering represents the order in which they were created to run.
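To illustrate how an attempt ID relates to its task, here is a small Python sketch. The parse_attempt_id helper is hypothetical, and, as a simplification, it accepts only the m and r task type letters.

```python
import re

# Pattern assumed from the example: the task ID body plus a 0-based attempt number.
ATTEMPT_ID = re.compile(r"^attempt_(\d+_\d+_[mr]_\d+)_(\d+)$")

def parse_attempt_id(attempt_id):
    """Return (task_id, zero_based_attempt_number) for an attempt ID string."""
    match = ATTEMPT_ID.match(attempt_id)
    if match is None:
        raise ValueError(f"not a recognized attempt ID: {attempt_id!r}")
    task_body, attempt_number = match.groups()
    return f"task_{task_body}", int(attempt_number)

task_id, attempt = parse_attempt_id("attempt_1410450250506_0003_m_000003_0")

# Attempts of one task sort by creation order when compared on the parsed number.
attempts = [
    "attempt_1410450250506_0003_m_000003_1",
    "attempt_1410450250506_0003_m_000003_0",
]
in_creation_order = sorted(attempts, key=parse_attempt_id)
```

Grouping attempts by the returned task ID is one way to see how many times a given task was retried or speculatively executed.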
The MapReduce Web UI
Hadoop comes with a web UI for viewing information about your jobs. It is useful for following
a job's progress while it is running, as well as finding job statistics and logs after
the job has completed. You can find the UI at http://resource-manager-host:8088/.