[hadoop@hc1nn wordcount]$ ls -l *.jar
-rw-rw-r--. 1 hadoop hadoop 5799 Jun 21 17:19 wordcount1.jar
The test data from the first example is still available on HDFS under the directory /user/hadoop/edgar, as the
Hadoop file system ls command shows:
[hadoop@hc1nn wordcount]$ hadoop dfs -ls /user/hadoop/edgar
Found 5 items
-rw-r--r-- 1 hadoop supergroup 410012 2014-06-19 11:59 /user/hadoop/edgar/10031.txt
-rw-r--r-- 1 hadoop supergroup 559352 2014-06-19 11:59 /user/hadoop/edgar/15143.txt
-rw-r--r-- 1 hadoop supergroup 66401 2014-06-19 11:59 /user/hadoop/edgar/17192.txt
-rw-r--r-- 1 hadoop supergroup 596736 2014-06-19 11:59 /user/hadoop/edgar/2149.txt
-rw-r--r-- 1 hadoop supergroup 63278 2014-06-19 11:59 /user/hadoop/edgar/932.txt
To give this first example a thorough test, I also created a patterns file called patterns.txt that contains a series of
unwanted characters. The contents of the file are shown here, listed by using the Linux cat command. Note that
some characters have an escape character (\) at the start of the line. This avoids processing errors for characters that
Java might consider to have special meaning. The escape character ensures that these patterns are treated as
plain text:
[hadoop@hc1nn wordcount]$ cat patterns.txt
!
"
'
_
;
\(
\)
\#
\$
\&
\.
\,
\*
\-
\/
\{
\}
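The effect of the escape character can be sketched in a few lines of Java. This is only an illustration, not the WordCount code itself: it assumes the job applies each line of patterns.txt to the input with String.replaceAll, which treats its first argument as a regular expression. The class PatternStripDemo and the method strip are hypothetical names introduced for this sketch.

```java
import java.util.List;

// Hypothetical helper, not the book's WordCount code. It assumes each
// pattern is applied with String.replaceAll, i.e. as a regular expression.
final class PatternStripDemo {

    // Remove every match of every pattern from the input line.
    static String strip(String line, List<String> patterns) {
        String result = line;
        for (String pattern : patterns) {
            result = result.replaceAll(pattern, "");
        }
        return result;
    }

    public static void main(String[] args) {
        // A few entries as they appear in patterns.txt. Inside a Java string
        // literal the backslash itself has to be doubled.
        List<String> patterns =
            List.of("!", "\"", "\\.", "\\,", "\\(", "\\)", "\\$");

        // prints: Hello world It costs 5
        System.out.println(strip("Hello, (world)! It costs $5.", patterns));

        // Unescaped, "." matches any character, so everything is removed.
        // prints: []
        System.out.println("[" + "a.b".replaceAll(".", "") + "]");

        // Escaped, only the literal period is removed.
        // prints: [ab]
        System.out.println("[" + "a.b".replaceAll("\\.", "") + "]");
    }
}
```

Under the same assumption, an unescaped ( or * would not even compile as a regular expression, which is the kind of processing error the escape character avoids.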
Copy patterns.txt to the HDFS directory /user/hadoop/java by using the Hadoop file system copyFromLocal
command. Then use the Hadoop file system ls command to confirm that patterns.txt is now on HDFS:
[hadoop@hc1nn wordcount]$ hadoop dfs -copyFromLocal ./patterns.txt /user/hadoop/java/patterns.txt
[hadoop@hc1nn wordcount]$ hadoop dfs -ls /user/hadoop/java
Found 1 items
-rw-r--r-- 1 hadoop supergroup 46 2014-06-21 17:29 /user/hadoop/java/patterns.txt
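The size reported by the listing (46 bytes) offers a quick check that the whole file arrived. The sketch below recomputes that figure from the patterns shown by the cat command, assuming each character occupies one byte and each of the 17 lines ends with a single newline. The class PatternsFileSize and the method expectedBytes are hypothetical names used only for this illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical sanity check: recompute the expected size of patterns.txt.
// Assumes single-byte ASCII characters and one newline per line.
final class PatternsFileSize {

    // Sum the byte length of each line plus one byte for its newline.
    static int expectedBytes(List<String> lines) {
        int total = 0;
        for (String line : lines) {
            total += line.getBytes(StandardCharsets.US_ASCII).length + 1;
        }
        return total;
    }

    public static void main(String[] args) {
        // The 17 lines of patterns.txt, written as Java string literals.
        List<String> lines = List.of(
            "!", "\"", "'", "_", ";",
            "\\(", "\\)", "\\#", "\\$", "\\&", "\\.",
            "\\,", "\\*", "\\-", "\\/", "\\{", "\\}");

        // 5 lines of 2 bytes plus 12 lines of 3 bytes; prints: 46
        System.out.println(expectedBytes(lines));
    }
}
```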
 