[hadoop@hc1nn edgar]$ ls -l
total 3868
-rw-rw-r--. 1 hadoop hadoop 632294 Feb 5 2004 10947-8.txt
-rw-r--r--. 1 hadoop hadoop 559342 Feb 23 2005 15143-8.txt
-rw-rw-r--. 1 hadoop hadoop 66409 Oct 27 2010 17192-8.txt
-rw-rw-r--. 1 hadoop hadoop 550284 Mar 16 2013 2147-8.txt
-rw-rw-r--. 1 hadoop hadoop 579834 Dec 31 2012 2148-8.txt
-rw-rw-r--. 1 hadoop hadoop 596745 Feb 17 2011 2149-8.txt
-rw-rw-r--. 1 hadoop hadoop 487087 Mar 27 2013 2150-8.txt
-rw-rw-r--. 1 hadoop hadoop 474746 Jul 1 2013 2151-8.txt
There are eight Linux text files in this directory that contain the test data. First, you copy this data from the Linux
file system into the HDFS directory /user/hadoop/edgar using the Hadoop file system copyFromLocal command:
[hadoop@hc1nn edgar]$ hadoop fs -copyFromLocal /tmp/edgar /user/hadoop/edgar
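Now, you check what has been loaded into HDFS. The listings show that the local edgar directory was copied as a subdirectory of the target, so the files appear under /user/hadoop/edgar/edgar: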
[hadoop@hc1nn edgar]$ hadoop dfs -ls /user/hadoop/edgar
Found 1 items
drwxr-xr-x - hadoop hadoop 0 2014-09-05 20:25 /user/hadoop/edgar/edgar
[hadoop@hc1nn edgar]$ hadoop dfs -ls /user/hadoop/edgar/edgar
Found 8 items
-rw-r--r-- 2 hadoop hadoop 632294 2014-03-16 13:50 /user/hadoop/edgar/edgar/10947-8.txt
-rw-r--r-- 2 hadoop hadoop 559342 2014-03-16 13:50 /user/hadoop/edgar/edgar/15143-8.txt
-rw-r--r-- 2 hadoop hadoop 66409 2014-03-16 13:50 /user/hadoop/edgar/edgar/17192-8.txt
-rw-r--r-- 2 hadoop hadoop 550284 2014-03-16 13:50 /user/hadoop/edgar/edgar/2147-8.txt
-rw-r--r-- 2 hadoop hadoop 579834 2014-03-16 13:50 /user/hadoop/edgar/edgar/2148-8.txt
-rw-r--r-- 2 hadoop hadoop 596745 2014-03-16 13:50 /user/hadoop/edgar/edgar/2149-8.txt
-rw-r--r-- 2 hadoop hadoop 487087 2014-03-16 13:50 /user/hadoop/edgar/edgar/2150-8.txt
-rw-r--r-- 2 hadoop hadoop 474746 2014-03-16 13:50 /user/hadoop/edgar/edgar/2151-8.txt
Next, you run the MapReduce job, using the hadoop jar command to run the wordcount program from the Hadoop
examples jar file. This counts the occurrences of each word in the Edgar Allan Poe data:
[hadoop@hc1nn edgar]$ cd $HADOOP_PREFIX
[hadoop@hc1nn hadoop-1.2.1]$ hadoop jar ./hadoop-examples-1.2.1.jar wordcount /user/hadoop/edgar /user/hadoop/edgar-results
This job executes the word-count task in the jar file hadoop-examples-1.2.1.jar. It takes data from HDFS under
/user/hadoop/edgar and outputs the results to /user/hadoop/edgar-results. The output of this command is as follows:
14/03/16 14:08:07 INFO input.FileInputFormat: Total input paths to process : 8
14/03/16 14:08:07 INFO util.NativeCodeLoader: Loaded the native-hadoop library
14/03/16 14:08:07 INFO mapred.JobClient: Running job: job_201403161357_0002
14/03/16 14:08:08 INFO mapred.JobClient: map 0% reduce 0%
14/03/16 14:08:18 INFO mapred.JobClient: map 12% reduce 0%
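To make the map and reduce percentages in this log more concrete, the following is a minimal in-process sketch of what a word-count job computes. It is not the code inside hadoop-examples-1.2.1.jar (that program is written in Java and runs across the cluster); the function names map_phase, shuffle, reduce_phase, and word_count are hypothetical, and the sketch assumes that words are split on whitespace only, with no lowercasing or punctuation stripping.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated token in a line."""
    return [(word, 1) for word in line.split()]


def shuffle(pairs):
    """Group the emitted counts by word, as happens between map and reduce."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reduce_phase(word, counts):
    """Sum the counts collected for a single word."""
    return word, sum(counts)


def word_count(lines):
    """Run the three phases in one process; results are sorted by word."""
    pairs = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(pairs)
    return [reduce_phase(word, grouped[word]) for word in sorted(grouped)]


if __name__ == "__main__":
    # Illustrative input only; the real job reads the eight files in HDFS.
    sample = ["the raven the bells", "The raven"]
    for word, total in word_count(sample):
        print(f"{word}\t{total}")
```

Because this sketch does not change case, "The" and "the" are counted as different words. On the cluster, the map work is spread across the input files and the reduce work is spread across the grouped keys, which is why the log reports the two percentages separately.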
 