-------+-------
Alice | 377
cqlsh:testks> select * from resultCF where key = 'Hatter';
KEY | count
--------+-------
Hatter | 54
cqlsh:testks> select * from resultCF where key = 'Cat';
KEY | count
-----+-------
Cat | 23
The word counts differ slightly, most likely because the split that I used behaves differently from the split function that Pig uses.
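To illustrate how two splitting rules can produce different counts for the same text, here is a minimal Python sketch. It assumes a plain whitespace split as a stand-in for the hand-written split (the actual split used earlier is not shown here), and it approximates Pig's TOKENIZE by splitting on the default delimiters listed in the Pig documentation (space, double quote, comma, parentheses, and asterisk). The function names and the sample line are hypothetical.

```python
import re
from collections import Counter

# Approximation of Pig's default TOKENIZE delimiters:
# space, double quote, comma, parentheses, and asterisk.
PIG_LIKE_DELIMITERS = r'[ ",()*]+'


def count_whitespace(line):
    """Count words with a plain whitespace split (assumed stand-in
    for a hand-written split; punctuation stays attached to words)."""
    return Counter(line.split())


def count_pig_like(line):
    """Count words by splitting on the Pig-like delimiter set and
    dropping the empty strings that re.split can produce."""
    return Counter(t for t in re.split(PIG_LIKE_DELIMITERS, line) if t)


# Hypothetical sample line, chosen so that punctuation touches a word.
line = 'Alice said, "Alice (the girl) saw Alice"'

print(count_whitespace(line)["Alice"])  # only the bare token 'Alice' matches
print(count_pig_like(line)["Alice"])    # quotes and commas are stripped first
```

With the whitespace split, tokens such as `"Alice` and `Alice"` are counted as separate words, so the count for `Alice` is lower than with the Pig-like split. Small discrepancies of this kind are what you would expect between two word-count implementations.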
Note that the Pig Latin used here may be very inefficient; the purpose of this
example is only to show the Cassandra and Pig integration. To learn Pig Latin, refer to
the Pig documentation. Apache Pig's official tutorial
( http://pig.apache.org/docs/r0.11.1/start.html#tutorial ) is a recommended starting point.
You may also want to use CQL with Pig. To do so, use CqlStorage (with some
versions, CqlStorage may not work, so try CqlNativeStorage instead). A word
count example looks as follows:
grunt> alice = LOAD 'cql://hadoop_test/lines' USING CqlStorage();
grunt> B = foreach alice generate flatten(TOKENIZE((chararray)$0)) as word;
grunt> C = group B by word;
grunt> D = foreach C generate COUNT(B) as word_count, group as word;
grunt> E = FOREACH D GENERATE TOTUPLE(TOTUPLE('word',word)),TOTUPLE('word_count', word_count);
grunt> STORE E INTO 'cql://hadoop_test/output?output_query=UPDATE%20hadoop_test.output%20SET%20word_count%20%3D%20%3F' USING CqlStorage();
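The output_query parameter in the STORE statement above is a percent-encoded CQL statement. As a minimal sketch, the following Python snippet shows one way to produce that encoded URL from the readable query. The helper name `cql_output_url` is hypothetical and is not part of Pig or Cassandra; only the standard library's `urllib.parse.quote` is used.

```python
from urllib.parse import quote


def cql_output_url(keyspace, table, query):
    """Build a cql:// URL whose output_query parameter carries the
    percent-encoded CQL statement (hypothetical helper for illustration)."""
    # safe='' makes quote() encode every reserved character,
    # so spaces become %20, '=' becomes %3D and '?' becomes %3F.
    return f"cql://{keyspace}/{table}?output_query={quote(query, safe='')}"


url = cql_output_url(
    "hadoop_test",
    "output",
    "UPDATE hadoop_test.output SET word_count = ?",
)
print(url)
# cql://hadoop_test/output?output_query=UPDATE%20hadoop_test.output%20SET%20word_count%20%3D%20%3F
```

Generating the string this way avoids hand-encoding mistakes when the prepared statement changes.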