Storing and Configuring Data with Hadoop, YARN, and ZooKeeper - Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset

[hadoop@hc1nn edgar]$ ls -l

total 3868

-rw-rw-r--. 1 hadoop hadoop 632294 Feb 5 2004 10947-8.txt

-rw-r--r--. 1 hadoop hadoop 559342 Feb 23 2005 15143-8.txt

-rw-rw-r--. 1 hadoop hadoop 66409 Oct 27 2010 17192-8.txt

-rw-rw-r--. 1 hadoop hadoop 550284 Mar 16 2013 2147-8.txt

-rw-rw-r--. 1 hadoop hadoop 579834 Dec 31 2012 2148-8.txt

-rw-rw-r--. 1 hadoop hadoop 596745 Feb 17 2011 2149-8.txt

-rw-rw-r--. 1 hadoop hadoop 487087 Mar 27 2013 2150-8.txt

-rw-rw-r--. 1 hadoop hadoop 474746 Jul 1 2013 2151-8.txt

There are eight Linux text files in this directory that contain the test data. First, you copy this data from the Linux file system into the HDFS directory /user/hadoop/edgar using the Hadoop file system copyFromLocal command:

[hadoop@hc1nn edgar]$ hadoop fs -copyFromLocal /tmp/edgar /user/hadoop/edgar

Now, you check the files that have been loaded to HDFS:

[hadoop@hc1nn edgar]$ hadoop dfs -ls /user/hadoop/edgar

Found 1 items

drwxr-xr-x - hadoop hadoop 0 2014-09-05 20:25 /user/hadoop/edgar/edgar

[hadoop@hc1nn edgar]$ hadoop dfs -ls /user/hadoop/edgar/edgar

Found 8 items

-rw-r--r-- 2 hadoop hadoop 632294 2014-03-16 13:50 /user/hadoop/edgar/edgar/10947-8.txt

-rw-r--r-- 2 hadoop hadoop 559342 2014-03-16 13:50 /user/hadoop/edgar/edgar/15143-8.txt

-rw-r--r-- 2 hadoop hadoop 66409 2014-03-16 13:50 /user/hadoop/edgar/edgar/17192-8.txt

-rw-r--r-- 2 hadoop hadoop 550284 2014-03-16 13:50 /user/hadoop/edgar/edgar/2147-8.txt

-rw-r--r-- 2 hadoop hadoop 579834 2014-03-16 13:50 /user/hadoop/edgar/edgar/2148-8.txt

-rw-r--r-- 2 hadoop hadoop 596745 2014-03-16 13:50 /user/hadoop/edgar/edgar/2149-8.txt

-rw-r--r-- 2 hadoop hadoop 487087 2014-03-16 13:50 /user/hadoop/edgar/edgar/2150-8.txt

-rw-r--r-- 2 hadoop hadoop 474746 2014-03-16 13:50 /user/hadoop/edgar/edgar/2151-8.txt
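
The listing above shows the files under /user/hadoop/edgar/edgar rather than directly under /user/hadoop/edgar. A likely explanation, assuming the target directory already existed in HDFS before the copy, is that a directory copied onto an existing directory is placed inside it. The sketch below is a local analogy only: it uses cp -r in place of hadoop fs -copyFromLocal, and every path in it is a hypothetical temporary directory, not one from the text.

```shell
#!/bin/sh
# Local analogy only: cp -r stands in for "hadoop fs -copyFromLocal".
# All paths below are hypothetical temporary directories.
set -eu

work=$(mktemp -d)
mkdir -p "$work/src/edgar"
echo "sample text" > "$work/src/edgar/10947-8.txt"

# Case 1: the destination does not exist yet, so the copy takes that name
# and the files sit directly underneath it.
cp -r "$work/src/edgar" "$work/fresh"
fresh_listing=$(ls "$work/fresh")

# Case 2: the destination already exists, so the source directory is
# placed inside it, giving existing/edgar/10947-8.txt.
mkdir -p "$work/existing"
cp -r "$work/src/edgar" "$work/existing"
nested_listing=$(ls "$work/existing")

echo "fresh destination contains:    $fresh_listing"
echo "existing destination contains: $nested_listing"

rm -rf "$work"
```

If the nested layout is not wanted, one option is to copy into a target path that does not exist yet, or to copy the files themselves rather than their parent directory.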

Next, you run the MapReduce job, using the hadoop jar command to run the word-count program from the examples jar file. This runs a word count on the Edgar Allan Poe data:

[hadoop@hc1nn edgar]$ cd $HADOOP_PREFIX

[hadoop@hc1nn hadoop-1.2.1]$ hadoop jar ./hadoop-examples-1.2.1.jar wordcount /user/hadoop/edgar /user/hadoop/edgar-results

This job executes the word-count task in the jar file hadoop-examples-1.2.1.jar. It reads its input from HDFS under /user/hadoop/edgar and writes the results to /user/hadoop/edgar-results. The output directory must not already exist, or the job fails. The output of this command is as follows:

14/03/16 14:08:07 INFO input.FileInputFormat: Total input paths to process : 8

14/03/16 14:08:07 INFO util.NativeCodeLoader: Loaded the native-hadoop library

14/03/16 14:08:07 INFO mapred.JobClient: Running job: job_201403161357_0002

14/03/16 14:08:08 INFO mapred.JobClient: map 0% reduce 0%

14/03/16 14:08:18 INFO mapred.JobClient: map 12% reduce 0%
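
To make the result of such a job concrete, here is a minimal local sketch of what a word count computes. It is not the Hadoop job itself: it assumes words are whitespace-delimited and case-sensitive, count_words is a hypothetical helper name, and the sample text is made up for illustration.

```shell
#!/bin/sh
# Local sketch of what a word count computes; this is not the Hadoop job.
set -eu

# count_words (hypothetical helper) reads text on stdin and prints
# "word<TAB>count" lines sorted by word.
#   tr      : map-like step, emits one whitespace-delimited token per line
#   sed     : drops the empty line that leading whitespace would leave
#   sort    : shuffle-like step, brings identical tokens together
#   uniq -c : reduce-like step, counts each run of identical tokens
#   awk     : reorders each line from "count word" to "word<TAB>count"
count_words() {
    tr -s '[:space:]' '\n' |
        sed '/^$/d' |
        LC_ALL=C sort |
        uniq -c |
        awk '{ print $2 "\t" $1 }'
}

# Made-up sample input, not taken from the test data.
sample='the raven the bells
the bells'

result=$(printf '%s\n' "$sample" | count_words)
printf '%s\n' "$result"
```

For the sample input this prints one line per distinct word with its count: bells 2, raven 1, the 3. The pipeline runs on one machine; the point of the MapReduce version is that the same three steps are spread across the cluster.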
