• We are specifying the column names for the max and min fields (f:max and
f:min, respectively). Note that we have to qualify each column with its
column family (f:).
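To illustrate how those column names are used, the STORE step of the script might look like the following sketch. The relation name sensor_stats and its field layout are assumptions for illustration; HBaseStorage treats the first field of each tuple as the row key and maps the remaining fields to the listed columns:

```
-- Hypothetical final step of hdfs-to-hbase.pig.
-- Assumes a relation sensor_stats with fields (id, max, min),
-- where id becomes the HBase row key.
STORE sensor_stats INTO 'hbase://sensors'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage('f:max f:min');
```

The column list 'f:max f:min' is what ties the tuple fields to the qualified columns discussed above.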
Before running this script, we need to create an HBase table called sensors . We
can do this from the HBase shell, as follows:
$ hbase shell
hbase> create 'sensors', 'f'
hbase> quit
Then, run the Pig script as follows:
$ pig hdfs-to-hbase.pig
Now watch the console output. Pig will execute the script as a MapReduce job.
Even though we are only importing two small files here, we can insert a fairly
large amount of data by exploiting the parallelism of MapReduce.
At the end of the run, Pig will print out some statistics:
Input(s):
Successfully read 7 records (591 bytes) from:
"hdfs://quickstart.cloudera:8020/user/cloudera/hbase-import"
Output(s):
Successfully stored 7 records in: "hbase://sensors"
Looks good! We should have seven rows in our HBase sensors table. We can
inspect the table from the HBase shell with the following commands:
$ hbase shell
hbase> scan 'sensors'
This is how your output might look:
ROW       COLUMN+CELL
 sensor11 column=f:max, timestamp=1412373703149, value=90
 sensor11 column=f:min, timestamp=1412373703149, value=70
 sensor22 column=f:max, timestamp=1412373703177, value=80
 sensor22 column=f:min, timestamp=1412373703177, value=70