Chapter 13
Troubleshooting Job Failures
There are different types of jobs you can submit to your HDInsight cluster, and it is inevitable that you will run into
problems every now and then while doing so. Though most HDInsight jobs are ultimately executed as MapReduce
jobs, the higher-level projects that make life easier for the developer, such as Hive, Pig, and Oozie, each call for their
own troubleshooting techniques. In this chapter, you will learn to troubleshoot the following types of failures:
MapReduce job failures
Hive job failures
Pig job failures
Sqoop job failures
Windows Azure Storage Blob failures
Cluster connectivity failures
MapReduce Jobs
All MapReduce job activities are logged by default in Hadoop in the C:\apps\dist\hadoop-1.2.0.1.3.1.0-06\logs\
directory of the name node. The log file name follows the format HADOOP-jobtracker-hostname.log. The most recent
data is in the .log file; older logs have the date appended to their names. On each of the Data Nodes or Task Nodes, you
will also find a subdirectory named userlogs inside the C:\apps\dist\hadoop-1.2.0.1.3.1.0-06\logs\ folder.
This directory contains a further subdirectory for every MapReduce task that runs on that node. Each task
records its stdout (output) and stderr (error) to two files in this subdirectory. If you are running a multinode Hadoop
cluster, the logs you find here are not centrally aggregated. To put together a complete picture, you need to
check each Task Node's /logs/userlogs/ directory for its output and then assemble the full log history to
understand what went wrong in a particular job.
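Because these per-task logs stay on the individual nodes, a small script can save time when you need to pull them together. The following Python sketch assumes the task nodes' log folders are reachable from the head node (for example, over administrative shares); the host names, the Hadoop version in the path, and the job ID shown are placeholders that you would replace with values from your own cluster.

    import os

    # Hypothetical task-node log roots reachable from the head node (for example,
    # over administrative shares). Host names, the Hadoop version in the path,
    # and the job ID used below are placeholders for your own cluster.
    USERLOG_ROOTS = [
        r"\\workernode0\c$\apps\dist\hadoop-1.2.0.1.3.1.0-06\logs\userlogs",
        r"\\workernode1\c$\apps\dist\hadoop-1.2.0.1.3.1.0-06\logs\userlogs",
    ]

    def collect_task_logs(job_id, roots=USERLOG_ROOTS):
        """Gather the stdout/stderr files every task of one job left behind,
        walking each node's userlogs tree and filtering by the job ID."""
        combined = []
        for root in roots:
            for dirpath, _dirnames, filenames in os.walk(root):
                if job_id not in dirpath:          # keep only this job's task directories
                    continue
                for name in ("stdout", "stderr"):  # a syslog file may also be present
                    if name in filenames:
                        with open(os.path.join(dirpath, name), errors="replace") as fh:
                            combined.append((dirpath, name, fh.read()))
        return combined

    if __name__ == "__main__":
        for dirpath, name, text in collect_task_logs("job_201312010000_0001"):
            print("=== %s\\%s ===" % (dirpath, name))
            print(text)

Running this from the head node prints every task's stdout and stderr for the given job in one place, which makes it much easier to spot the node and task attempt where the failure actually occurred.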
In a Hadoop cluster, the entire job submission, execution, and history-management process is handled by three
types of services:
JobTracker The JobTracker is the master of the system; it manages the jobs and resources in
the cluster (the TaskTrackers). The JobTracker schedules the tasks and coordinates with each of the
TaskTrackers that are launched to complete the jobs.
TaskTrackers These are the slave services deployed on Data Nodes or Task Nodes. They are
responsible for running the map and reduce tasks as instructed by the JobTracker.
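To see what the JobTracker currently knows about your jobs, you can also query it from the command line with the standard hadoop job commands. The short Python sketch below simply wraps those commands; it assumes it is run on the head node from a Hadoop command prompt where the launcher is on the PATH, and the job ID shown is a placeholder.

    import subprocess

    # Minimal sketch: query the JobTracker through the standard "hadoop job" CLI.
    # On the Windows-based clusters described here the launcher is hadoop.cmd;
    # adjust HADOOP_CMD if your environment exposes it differently.
    HADOOP_CMD = "hadoop.cmd"

    def list_jobs():
        """Ask the JobTracker for the list of currently running jobs."""
        return subprocess.check_output([HADOOP_CMD, "job", "-list"]).decode()

    def job_status(job_id):
        """Ask the JobTracker for one job's state and map/reduce progress."""
        return subprocess.check_output([HADOOP_CMD, "job", "-status", job_id]).decode()

    if __name__ == "__main__":
        print(list_jobs())
        print(job_status("job_201312010000_0001"))  # placeholder job ID

The -list output tells you which jobs the JobTracker is tracking, and -status reports a job's state and map/reduce progress, which is often enough to decide whether you need to dig into the per-task logs described earlier.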