JOB HISTORY
Job history refers to the events and configuration for a completed MapReduce job. It is retained
regardless of whether the job was successful, in an attempt to provide useful information for the
user running a job.
Job history files are stored in HDFS by the MapReduce application master, in a directory set by the
mapreduce.jobhistory.done-dir property. Job history files are kept for one week before being
deleted by the system.
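The one-week retention rule can be sketched as a simple age check. This is only an illustration: the function name is_expired is hypothetical, and measuring age from the job's finish time is an assumption, since the text does not say which timestamp the system uses.

```python
from datetime import datetime, timedelta

# One-week retention period, as described in the text above.
RETENTION = timedelta(weeks=1)

def is_expired(finished_at, now, retention=RETENTION):
    """Return True if a history file is older than the retention period.

    Assumption: age is measured from the job's finish time.
    """
    return now - finished_at > retention

now = datetime(2024, 1, 9)
print(is_expired(datetime(2024, 1, 1), now))  # finished eight days earlier
print(is_expired(datetime(2024, 1, 5), now))  # finished four days earlier
```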
The history log includes job, task, and attempt events, all of which are stored in a file in JSON format.
The history for a particular job may be viewed through the web UI for the job history server (which is
linked to from the resource manager page) or via the command line using mapred job -history
(which you point at the job history file).
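Because the events are stored as JSON, a history log can in principle be summarized with a few lines of code. The sketch below assumes one JSON object per line with a "type" field naming the event; the sample records are made up for illustration and are not copied from a real history file. Lines that are not JSON objects (such as a header) are skipped.

```python
import json
from collections import Counter

def count_event_types(lines):
    """Count events by their "type" field, skipping lines that are not JSON objects."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # not JSON, for example a header line
        if isinstance(record, dict) and "type" in record:
            counts[record["type"]] += 1
    return counts

# Hypothetical sample records covering job, task, and attempt events.
sample = [
    "header line that is not JSON",
    '{"type": "JOB_SUBMITTED", "event": {"jobid": "job_0001"}}',
    '{"type": "TASK_STARTED", "event": {"taskid": "task_0001_m_000000"}}',
    '{"type": "TASK_STARTED", "event": {"taskid": "task_0001_m_000001"}}',
    '{"type": "MAP_ATTEMPT_STARTED", "event": {"taskid": "task_0001_m_000000"}}',
    '{"type": "JOB_FINISHED", "event": {"jobid": "job_0001"}}',
]

for event_type, n in sorted(count_event_types(sample).items()):
    print(event_type, n)
```

For everyday use, the web UI or mapred job -history is the more convenient route; a script like this is only worthwhile for custom analysis.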
The MapReduce job page
Clicking on the link for the “Tracking UI” takes us to the application master's web UI (or
to the history page if the application has completed). In the case of MapReduce, this takes
us to the job page, illustrated in Figure 6-2 .
Figure 6-2. Screenshot of the job page
While the job is running, you can monitor its progress on this page. The table at the
bottom shows the map progress and the reduce progress. “Total” shows the total number of
map and reduce tasks for this job (a row for each). The other columns then show the state
of these tasks: “Pending” (waiting to run), “Running,” or “Complete” (successfully run).
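The relationship between the columns can be sketched as follows. The TaskRow class is hypothetical, and the sketch makes two simplifying assumptions: every task is in exactly one of the three states described (failed or killed tasks are ignored), and the fraction computed is just completed tasks over total tasks, which need not match the progress figure the page reports.

```python
from dataclasses import dataclass

@dataclass
class TaskRow:
    """One row of the table (map or reduce), under the assumptions stated above."""
    total: int
    running: int
    complete: int

    @property
    def pending(self):
        # Tasks that are neither running nor complete are still waiting to run.
        return self.total - self.running - self.complete

    @property
    def fraction_complete(self):
        # Simplification: share of tasks that have finished successfully.
        return self.complete / self.total if self.total else 0.0

maps = TaskRow(total=10, running=3, complete=5)
print(maps.pending, maps.fraction_complete)
```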