EXERCISE 9.1 (continued)
$ hadoop dfs -rmr -skipTrash /user/myuser/docs/
To read a text file, run the following commands (you can avoid spilling text on the terminal by using a pipe):
$ hadoop dfs -cat /user/myuser/docs/resume.txt
$ hadoop dfs -cat /user/myuser/docs/resume.txt | less
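If you only need to inspect the end of a large file, the -tail option prints the last kilobyte of the file instead of streaming all of it; the sketch below reuses the example file from above:
$ hadoop dfs -tail /user/myuser/docs/resume.txt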
To read compressed files (such as Zip) or encoded files (such as TextRecordInputStream), use the -text option:
$ hadoop dfs -text /user/myuser/docs/compressed_report.zip
$ hadoop dfs -text /user/myuser/docs/compressed_report.zip | less
EXERCISE 9.2
Killing a Hadoop Job and Avoiding Zombie Processes
To kill a Hadoop job, the user needs the job ID. The job ID is printed when a Hadoop job starts executing. Another, more formal, method is to use the Hadoop web interface, also known as the JobTracker UI (for a single-node setup, accessible at http://localhost:50030). The JobTracker displays information about running jobs, retired or finished jobs, and killed or failed jobs. To kill a job and avoid zombie processes, run the following:
$ hadoop job -kill <job-id>
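If you did not capture the job ID from the console output, you can also retrieve it from the command line; -list prints the IDs of currently running jobs (the job ID below is a made-up example of the usual job_<timestamp>_<sequence> form):
$ hadoop job -list
$ hadoop job -kill job_201301011234_0001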
EXERCISE 9.3
Resolving a Common IOException with HDFS
A common Java IOException can occur when the nodes are started or during the execution of a job. This happens because the HDFS .Trash directory is full. To resolve the issue, clear the HDFS .Trash directory and restart the cluster. Remember that this has to be done from the NameNode terminal because the NameNode is the master node.
$ hadoop dfs -rmr /user/myuser/.Trash/*
$ /bin/hadoop-install-path/bin/stop-all.sh
$ /bin/hadoop-install-path/bin/start-all.sh
To check whether the nodes (NameNode and DataNodes) have started, run the following on the NameNode terminal:
$ jps
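On a healthy single-node setup, jps should list all of the Hadoop daemons started by start-all.sh; the output looks roughly like the following (the process IDs shown are illustrative):
$ jps
4821 NameNode
4932 DataNode
5044 SecondaryNameNode
5161 JobTracker
5278 TaskTracker
5390 Jps
You can also confirm that the DataNodes have registered with the NameNode by running hadoop dfsadmin -report, which prints the configured capacity and the number of live DataNodes.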
 