Running Pig Interactively with Grunt
From the bin folder of the Pig install folder (hdp\hadoop\pig\bin), open
the Pig command-line console to launch the Grunt shell. The Grunt shell
enables you to run Pig Latin interactively and view the results of each step.
Enter the following script to load and create a schema for the traffic data:
-- Load the traffic file; PigStorage() splits fields on tabs by default
SpeedData = LOAD '/user/test/traffic.txt'
    USING PigStorage() AS (dtstamp:chararray,
    sensorid:int, speed:double);
Dump the results to the screen:
DUMP SpeedData;
The DUMP statement runs a MapReduce job that writes the data to the
console window. You should see data similar to Figure 9.7, which shows the
tuples that make up the data set.
Figure 9.7 Dumping results to the console window
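Before dumping a large file, it can help to check the schema and preview only a few rows. The following is a minimal sketch that assumes the SpeedData relation loaded earlier; the alias FirstRows and the row count of 10 are illustrative choices, not part of the original script.

```pig
-- Print the schema that the AS clause assigned to the relation
DESCRIBE SpeedData;

-- Keep only 10 tuples (FirstRows is a hypothetical alias) and dump those
FirstRows = LIMIT SpeedData 10;
DUMP FirstRows;
```

DESCRIBE only reports the schema and does not launch a job, whereas DUMP still does. LIMIT makes no promise about which tuples it returns unless the relation has been ordered first.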
Using PiggyBank to Extract Time Periods
The next step in analyzing the data is to group it into different date/time
buckets. To accomplish this, you use functions defined in the
piggybank.jar file. If that file is not already installed, you can either
download and compile the source code or download a compiled jar file
from www.wiley.com/go/microsoftbigdatasolutions. Along with the
piggybank.jar file, you need to get a copy of the joda-time-2.2.jar