You can git clone the project or download the zip (https://github.com/jeromatron/pygmalion/archive/master.zip)
and build it on your local machine. Alternatively, you can find
pygmalion-1.1.0-SNAPSHOT.jar in the jars folder of the downloads for this topic.
1. First, let's create the keyspace and table in Cassandra as follows:
create keyspace twitter with replication = {'class':'SimpleStrategy','replication_factor':1};
use twitter;
create table twitterdata(id timeuuid primary key, screen_name text, tweetDate text, body text);
2. Load tweets using PigStorage:
tweets = LOAD '/home/vivek/tweets' USING PigStorage('\ua001') as (date:chararray,screen_name:chararray,body:chararray);
3. Register the required jars with Pig:
register /home/vivek/Documents/apress_book/Apress/uuid-3.2.jar;
register /home/vivek/Documents/apress_book/Apress/hector-core-0.7.0-28.jar;
register /home/vivek/Documents/apress_book/Apress/pygmalion-1.1.0-SNAPSHOT.jar;
4. Define the Pig functions:
define FromCassandraBag org.pygmalion.udf.FromCassandraBag();
define ToCassandraBag org.pygmalion.udf.ToCassandraBag();
define CqlStorage org.apache.cassandra.hadoop.pig.CqlStorage();
define GenerateBinTimeUUID org.pygmalion.udf.uuid.GenerateBinTimeUUID();
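To make the data shapes in the steps above concrete, here is a minimal Python sketch (not part of the Pig script) of what one record looks like on its way from the delimited file to the twitterdata table. It assumes that each input line holds date, screen_name, and body separated by the delimiter passed to PigStorage, and that the timeuuid key is a version 1 (time-based) UUID generated per tweet. The helper name parse_tweet and the sample record are hypothetical, and the sketch does not reproduce how the pygmalion UDFs are implemented.

```python
import uuid

# Delimiter copied from the PigStorage call above.
DELIM = "\ua001"


def parse_tweet(line):
    """Split one raw line into the three fields of the LOAD schema
    and attach a time-based UUID as the row key.

    Hypothetical helper for illustration only.
    """
    # maxsplit=2 keeps any further delimiters inside the body;
    # that is a choice of this sketch.
    date, screen_name, body = line.rstrip("\n").split(DELIM, 2)
    return {
        "id": uuid.uuid1(),          # version 1 UUID, the kind CQL's timeuuid stores
        "screen_name": screen_name,
        "tweetdate": date,           # unquoted CQL identifiers are lowercased
        "body": body,
    }


if __name__ == "__main__":
    # Made-up record, not taken from the tweets file.
    sample = DELIM.join(["2013-10-01", "some_user", "hello cassandra"])
    row = parse_tweet(sample)
    print(row["id"].version, row["screen_name"], row["tweetdate"], row["body"])
```

Each dictionary key corresponds to one column of the table created in step 1, which is the mapping the Pig script has to produce before the rows can be written to Cassandra.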