14/03/23 16:34:48 INFO mapred.MapTask: data buffer =
........
14/03/23 16:34:56 INFO mapred.JobClient: Spilled Records=256702
14/03/23 16:34:56 INFO mapred.JobClient: CPU time spent (ms)=0
14/03/23 16:34:56 INFO mapred.JobClient: Physical memory (bytes) snapshot=0
14/03/23 16:34:56 INFO mapred.JobClient: Virtual memory (bytes) snapshot=0
14/03/23 16:34:56 INFO mapred.JobClient: Total committed heap usage (bytes)=1507446784
Notice that the hadoop jar command is very similar to the one used in V1: you specify an example jar file from which to execute the word-count function, along with input and output data directories on HDFS. The run time is also almost the same.
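In essence, the word-count example tokenizes each input line on whitespace, has the mapper emit a (token, 1) pair for every token, and has the reducer sum the counts per token. As a rough local sketch of that logic (this is an illustration in Python, not the actual Java code shipped in the Hadoop examples jar):

```python
from collections import Counter

def map_phase(lines):
    """Mapper: split each line on whitespace and emit (token, 1) pairs."""
    for line in lines:
        for token in line.split():
            yield (token, 1)

def reduce_phase(pairs):
    """Reducer: sum the counts for each distinct token."""
    counts = Counter()
    for token, n in pairs:
        counts[token] += n
    return dict(counts)

# Toy input standing in for the edgar text files on HDFS
lines = ["the raven the raven", "nevermore"]
print(reduce_phase(map_phase(lines)))
# {'the': 2, 'raven': 2, 'nevermore': 1}
```

On the cluster, of course, the map and reduce phases run as distributed tasks over the HDFS input splits rather than as two in-process functions.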
Okay, the Map Reduce job has finished, so you take a look at the output. In the edgar-results directory, there is a
_SUCCESS file to indicate a positive outcome and a part-r-00000 file that contains the reduced data:
[hadoop@hc1nn ~]$ hadoop fs -ls /user/hadoop/edgar-results
Found 2 items
-rw-r--r-- 2 hadoop hadoop 0 2014-03-23 16:34 /user/hadoop/edgar-results/_SUCCESS
-rw-r--r-- 2 hadoop hadoop 769870 2014-03-23 16:34 /user/hadoop/edgar-results/part-r-00000
The job was successful, and you have part-file data. To examine it, you need to extract it from HDFS. The Hadoop file system cat command dumps the contents of the part file, which is then redirected into the Linux file system file /tmp/hadoop/part-r-00000:
[hadoop@hc1nn ~]$ mkdir -p /tmp/hadoop/
[hadoop@hc1nn ~]$ hadoop fs -cat /user/hadoop/edgar-results/part-r-00000 > /tmp/hadoop/part-r-00000
[hadoop@hc1nn ~]$ wc -l /tmp/hadoop/part-r-00000
67721 /tmp/hadoop/part-r-00000
Using the Linux command wc -l to count the file's lines shows that there are 67,721 lines in the extracted file. This is the same result as you received from the Map Reduce word-count job in the V1 example. To list the actual data, you use:
[hadoop@hc1nn ~]$ head -20 /tmp/hadoop/part-r-00000
! 1
" 22
"''T 1
"'-- 1
"'A 1
"'After 1
"'Although 1
"'Among 2
"'And 2
"'Another 1
"'As 2
"'At 1
"'Aussi 1
"'Be 2
"'Being 1
"'But 1
"'But,' 1
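Each line of the part file is a token, a tab character, and its count, with the tokens in sorted key order. Because the tokenizer splits only on whitespace, punctuation stays attached to the words, which is why tokens such as ! and "'But,' appear in the output. A minimal sketch for loading such a part file back into a dictionary (the parsing of the tab-separated format is the point; the file path is whatever you extracted to):

```python
def load_counts(path):
    """Parse a MapReduce part file of tab-separated token/count lines
    into a dict mapping each token to its integer count."""
    counts = {}
    with open(path) as f:
        for line in f:
            # Split on the last tab, so tokens containing tabs
            # (unlikely here, but safe) are handled correctly.
            token, count = line.rstrip("\n").rsplit("\t", 1)
            counts[token] = int(count)
    return counts
```

For example, load_counts("/tmp/hadoop/part-r-00000") would return a dictionary with 67,721 entries for the file extracted above.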