3. Let's group name for all fields using the GROUP command:
namegroup = group name ALL;
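GROUP ... ALL places every tuple of name into a single group whose key is the literal string all. As a quick check, the following sketch inspects the grouped relation. It assumes the name relation from the earlier steps is still defined in the same Grunt session; groupKey is a hypothetical alias used only for illustration, and the schema that DESCRIBE prints depends on how name was defined.

```pig
-- Print the schema of the grouped relation: a 'group' key plus a bag named 'name'.
DESCRIBE namegroup;

-- GROUP ALL yields exactly one tuple, so project only the key to keep the output short.
-- groupKey is a hypothetical alias, not part of the original exercise.
groupKey = FOREACH namegroup GENERATE group;
DUMP groupKey;   -- prints (all)
```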
4. To count tweets for namegroup, run the FOREACH command with the COUNT function:
tweetCount = FOREACH namegroup generate COUNT(name);
dump tweetCount;
Figure 6-12 shows the output of dumping tweetCount on the console. The tweet count for screen name The News Selector is 6680.
Figure 6-12. Output of dumping tweetCount on console
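One detail worth keeping in mind: COUNT skips tuples whose first field is null, whereas COUNT_STAR counts every tuple in the bag. The sketch below computes both side by side so any difference becomes visible. It assumes namegroup from step 3 is available in the same session; the aliases bothCounts, non_null_names, and all_rows are hypothetical names chosen for this illustration.

```pig
-- Compare the two built-in counting functions over the same bag.
-- COUNT ignores tuples whose first field is null; COUNT_STAR includes them.
bothCounts = FOREACH namegroup GENERATE
    COUNT(name)      AS non_null_names,
    COUNT_STAR(name) AS all_rows;
DUMP bothCounts;
```

If the two numbers match, the relation contains no tuples with a null first field.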
5. To store output in a file, we can use the CSVExcelStorage function. To do that, we need to register it first with the Pig registry:
register '$PIG_HOME/contrib/piggybank/java/piggybank.jar';
define CSVExcelStorage org.apache.pig.piggybank.storage.CSVExcelStorage();
Once these commands have been run in the Grunt shell, the function is registered and ready for use.
6. Continuing the same exercise, we can also store the total tweet count as follows:
totalGroup = group tweets ALL;
totalCount = foreach totalGroup generate COUNT(tweets);
store totalCount into 'totalcount' using CSVExcelStorage(',','YES_MULTILINE');
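To confirm what was written, the stored result can be loaded back and dumped. This is a minimal sketch that assumes the register and define statements from step 5 have already been run in the same session; storedCount and the field name total are hypothetical, introduced only for this check. Note that STORE writes a directory named totalcount rather than a single file, and the job fails if that directory already exists.

```pig
-- Load the stored output back with the same storage function and options.
-- storedCount and total are illustrative names, not part of the original exercise.
storedCount = LOAD 'totalcount'
    USING CSVExcelStorage(',','YES_MULTILINE')
    AS (total:long);
DUMP storedCount;
```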