choices on the screen, and WordCount is a choice on the next screen. On this screen is the Davinci.txt download link.
Now, complete the following steps to verify Pig:
1. Download Davinci.txt (or any text file) into a new folder, C:\PigSource. Then create a directory in HDFS to hold the data:
hadoop fs -mkdir wordcount
2. Copy the file from the C:\PigSource folder into the wordcount folder in HDFS:
hadoop fs -put c:\PigSource\davinci.txt wordcount/
3. To verify the file was copied as expected, run the -ls command:
hadoop fs -ls wordcount
Now let's log in to the Pig console.
4. Navigate to the hadoop\pig-0.11.0.1.3.0.0-0380\bin directory and enter the following:
pig
You will find yourself at the grunt prompt:
grunt>
5. To get used to the interactive console, type each of the following lines at the grunt prompt. Press Enter after each line; the grunt prompt should reappear, ready for the next line:
myinput = LOAD 'wordcount/davinci.txt' USING TextLoader();
words = FOREACH myinput GENERATE FLATTEN(TOKENIZE($0));
grouped = GROUP words BY $0;
counts = FOREACH grouped GENERATE group, COUNT(words);
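To make the data flow of these four statements easier to follow, here is a minimal local sketch in Python of roughly what they compute: each line is split into tokens, identical tokens are grouped, and each group is counted. The function name word_count and the sample lines are hypothetical, and the delimiter set (space, double quote, comma, parentheses, asterisk) is an assumption about TOKENIZE's default separators. It runs on an in-memory list, not on HDFS, so treat it as an illustration rather than a replacement for the Pig script.

```python
import re
from collections import Counter

# Assumed approximation of Pig's default TOKENIZE separators:
# space, double quote, comma, parentheses, and asterisk.
DELIMITERS = re.compile(r'[ ",()*]+')

def word_count(lines):
    """Return a Counter mapping each token to its number of occurrences."""
    counts = Counter()
    for line in lines:
        # Equivalent of FLATTEN(TOKENIZE($0)): one token per output record.
        tokens = [t for t in DELIMITERS.split(line.rstrip("\n")) if t]
        # Equivalent of GROUP ... BY $0 followed by COUNT(words).
        counts.update(tokens)
    return counts

if __name__ == "__main__":
    sample = ["the cat, the hat", "(the) end"]  # hypothetical input lines
    for word, n in sorted(word_count(sample).items()):
        print(word, n)
```

To try it against the downloaded file instead of the sample, pass an open file object for C:\PigSource\davinci.txt as the lines argument. Like the Pig script, the sketch is case sensitive, so "The" and "the" are counted separately.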