3.
Let's collect every record of the name relation into a single group using the GROUP command with the ALL keyword:
namegroup = group name ALL;
4.
To count the tweets in namegroup, run the FOREACH command with
the COUNT function:
tweetCount = FOREACH namegroup GENERATE COUNT(name);
dump tweetCount;
Figure 6-12 shows the output of dumping tweetCount to the
console. The tweet count for the screen name The News Selector is
6680.
Figure 6-12. Output of dumping tweetCount on console
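To see why COUNT(name) works in this step, it helps to look at what GROUP with ALL produces: a single tuple whose first field is the constant all and whose second field is a bag, named after the grouped relation, that holds every record. The following is a minimal sketch, assuming name was built in the earlier steps with a single chararray field. The LOAD line, the path 'names.txt', and the field name screen_name are hypothetical stand-ins and not part of the original exercise.

```pig
-- Hypothetical stand-in for the name relation built in the earlier steps
name = LOAD 'names.txt' AS (screen_name:chararray);

-- Collapse every record into one group keyed by the constant 'all'
namegroup = GROUP name ALL;

-- Prints the schema, roughly: {group: chararray, name: {(screen_name: chararray)}}
DESCRIBE namegroup;

-- COUNT skips tuples whose first field is null; COUNT_STAR counts them too
tweetCount = FOREACH namegroup GENERATE COUNT(name) AS counted,
                                        COUNT_STAR(name) AS counted_with_nulls;
DUMP tweetCount;
```

If the two numbers differ, some records have a null first field, which may be worth checking before trusting the count.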
5.
To store output in a file, we can use the CSVExcelStorage function.
To do that, we first need to register the piggybank JAR that contains it and define a short alias for the function:
register '$PIG_HOME/contrib/piggybank/java/piggybank.jar';
define CSVExcelStorage org.apache.pig.piggybank.storage.CSVExcelStorage();
After you run these commands in the Grunt shell, the function is
registered and ready for use.
6.
Continuing the same exercise, we can also store the total tweet count
as follows:
totalGroup = group tweets ALL;
totalCount = foreach totalGroup generate COUNT(tweets);
store totalCount into 'totalcount' using CSVExcelStorage(',','YES_MULTILINE');
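As a quick check, the stored result can be read back with the same function, since CSVExcelStorage can be used for loading as well as storing. This is a minimal sketch, assuming the register and define statements from step 5 have already been run in the same session. The alias stored and the field name total are made up for illustration.

```pig
-- Read back the single-row result written by the STORE statement above
stored = LOAD 'totalcount' USING CSVExcelStorage(',', 'YES_MULTILINE') AS (total:long);
DUMP stored;
```

Note that 'totalcount' is written as a directory of part files rather than a single file, and that the STORE statement fails if that directory already exists, so remove it before rerunning the step.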
 
 