Figure 6.5 List of tasks in the TaskTracker Web UI. This figure shows all the map tasks for a single job. Each task can update its own status message.
Tasks are identified by a task ID. To construct the task ID, you start with the job ID the task runs under, but replace the job_ prefix with task_. You then append _m for a map task or _r for a reduce task, followed by an auto-incremented number within each group. In the TaskTracker Web UI, you'll see each task with its status, which you can set programmatically through the setStatus() method described earlier.
Clicking on a task ID will bring you to a page that further describes the different attempts of a task. Hadoop makes several retry attempts at a failed task before failing the entire job.
The JobTracker and TaskTracker UIs provide many other links and metrics. Most
should be self-explanatory.
KILLING JOBS
Unfortunately, sometimes a job goes awry after you've started it but doesn't actually fail. It may take a long time to run or may even be stuck in an infinite loop. In (pseudo-)distributed mode you can manually kill a job using the command

bin/hadoop job -kill job_id

where job_id is the job's ID as given in the JobTracker's Web UI.
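If the Web UI isn't handy, you can also find the job ID on the command line. A sketch, assuming a running (pseudo-)distributed cluster with bin/hadoop on your path; the job ID shown is hypothetical:

```shell
# List currently running jobs to find the job ID (requires a running cluster)
bin/hadoop job -list

# Kill the offending job; job_200904211745_0001 is a hypothetical ID
bin/hadoop job -kill job_200904211745_0001
```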
6.2 Monitoring and debugging on a production cluster
After successfully running your job in a pseudo-distributed cluster, you're ready to run it on a production cluster using real data. All the techniques we've used for development and debugging still apply on the production cluster, although the exact usage may differ slightly. Your cluster should still have a JobTracker Web UI, but the domain is no longer localhost; it's now the address of the cluster's JobTracker. The port number will still be 50030 unless it's been configured differently.
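Putting those pieces together, the production UI's address can be formed like this. The hostname is a placeholder for your cluster's actual JobTracker node.

```shell
# master.example.com is a hypothetical hostname; substitute the address
# of your cluster's JobTracker
jobtracker_host="master.example.com"
jobtracker_port=50030                 # default port unless reconfigured
url="http://${jobtracker_host}:${jobtracker_port}/"

echo "$url"   # http://master.example.com:50030/
```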