This shows that the path is good: it is the one that was just installed. Issuing the Pig help command is another
good way to ensure that commands will run without error:
[hadoop@hc1nn ~]$ pig -help
Apache Pig version 0.12.1 (r1585011)
compiled Apr 05 2014, 01:41:34
USAGE: Pig [options] [-] : Run interactively in grunt shell.
Pig [options] -e[xecute] cmd [cmd ...] : Run cmd(s).
Pig [options] [-f[ile]] file : Run cmds found in file.
options include:
..............
-M, -no_multiquery - Turn multiquery optimization off; default is on
-P, -propertyFile - Path to property file
-printCmdDebug - Overrides anything else and prints the actual command used to run Pig, including
any environment variables that are set by the pig command.
The results are good: the pig command is located in /usr/local/pig/bin, and it runs, as the help output shows. It is
now time to use it.
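The two checks made so far (is the command on the path, and does its help option run) can be folded into one small shell function. This is only a sketch: the function name check_cli and its messages are made up for illustration and are not part of Pig or Hadoop.

```shell
# Report whether a command is on the PATH and whether its help option runs.
# check_cli is a hypothetical helper; the help flag defaults to -help.
check_cli() {
  cmd="$1"
  help_flag="${2:--help}"
  # command -v prints the resolved location, or fails if nothing is found.
  location=$(command -v "$cmd") || { echo "$cmd: not found on PATH"; return 1; }
  echo "$cmd: found at $location"
  if "$cmd" "$help_flag" >/dev/null 2>&1; then
    echo "$cmd: $help_flag ran without error"
  else
    echo "$cmd: $help_flag returned an error"
    return 2
  fi
}
```

On the server used here, it would be called as check_cli pig -help.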
Running Pig
Pig lets you choose how you wish to work with it. For example, you can direct where Pig looks for data by specifying
either local mode or MapReduce mode (the default). Local mode reads all data from the local server's file
system, while MapReduce mode uses Hadoop and HDFS. In addition, you can run tasks interactively or in batch mode. When
working interactively, you issue Pig commands via the Grunt command prompt. For larger scheduled or background
tasks, you can use batch mode. For the word-count demonstration, you will use Pig interactively in MapReduce mode.
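As a rough guide, the two choices described above map onto the pig command line through the -x (exectype) option and an optional script file. The helper below only prints the command it would run, so it can be tried on a machine without Pig. The function name pig_cmd and the script name wordcount.pig are hypothetical.

```shell
# Build (but do not run) a pig command line for a given mode and optional script.
# Usage: pig_cmd <local|mapreduce> [script]
pig_cmd() {
  mode="$1"
  script="${2:-}"
  case "$mode" in
    local|mapreduce) ;;
    *) echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  if [ -n "$script" ]; then
    echo "pig -x $mode $script"   # batch: run the commands found in the script file
  else
    echo "pig -x $mode"           # interactive: opens the Grunt prompt
  fi
}

pig_cmd local                     # prints: pig -x local
pig_cmd mapreduce wordcount.pig   # prints: pig -x mapreduce wordcount.pig
```

Because MapReduce mode is the default, running pig with no -x option should behave the same as pig -x mapreduce.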
To prepare to use Pig, you first need to create a Pig working directory on HDFS:
[hadoop@hc1nn edgar]$ hadoop dfs -mkdir /user/hadoop/pig/
Then, you copy a text-based data file of Edgar Allan Poe's work into that HDFS-based directory from the Linux file
system by using the Hadoop file system command copyFromLocal:
[hadoop@hc1nn edgar]$ cd $HOME/edgar
[hadoop@hc1nn edgar]$ ls
10031.txt 15143.txt 17192.txt 2149.txt 932.txt
[hadoop@hc1nn edgar]$ hadoop dfs -copyFromLocal ./10031.txt /user/hadoop/pig
A quick check on HDFS shows that the file 10031.txt containing the text is now sitting on HDFS in the directory
/user/hadoop/pig:
[hadoop@hc1nn edgar]$ hadoop dfs -ls /user/hadoop/pig
Found 1 items
-rw-r--r-- 1 hadoop supergroup 410012 2014-06-18 12:29 /user/hadoop/pig/10031.txt
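The three HDFS steps above (create the directory, copy the file, list the result) can be collected into one function for reuse with the other text files. This is a sketch under some assumptions: the function name stage_to_hdfs and the HADOOP_CMD variable are invented here, and the directory check relies on the file system shell's -test -d option. On newer Hadoop releases the hdfs dfs form is preferred over hadoop dfs, and -mkdir may need -p when parent directories are missing.

```shell
# Stage a local file into an HDFS directory and list the result.
# HADOOP_CMD is a hypothetical override so the steps can be dry-run
# (for example HADOOP_CMD=echo) on a machine without Hadoop.
stage_to_hdfs() {
  src="$1"
  dest="$2"
  hadoop_cmd="${HADOOP_CMD:-hadoop}"
  if [ ! -f "$src" ]; then
    echo "local file not found: $src" >&2
    return 1
  fi
  # Create the target directory only if it is not already there.
  $hadoop_cmd dfs -test -d "$dest" || $hadoop_cmd dfs -mkdir "$dest" || return 1
  # Copy the file from the Linux file system into HDFS.
  $hadoop_cmd dfs -copyFromLocal "$src" "$dest" || return 1
  # List the directory so the copy can be confirmed by eye.
  $hadoop_cmd dfs -ls "$dest"
}
```

For the file used here, the call would be stage_to_hdfs ./10031.txt /user/hadoop/pig, run from the $HOME/edgar directory.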
 