CREATE EXTERNAL TABLE packtPubEmployee (item map<string, string>)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES ("dynamodb.table.name" = "Employee");

INSERT OVERWRITE TABLE packtPubEmployee SELECT * FROM packtPubEmployee_s3;
Here, instead of specifying individual attributes in the table definition, we declare a single map column that holds all the values and writes each entry to the corresponding DynamoDB attribute. However, because no attributes are named in the schema, we cannot query such a table by attribute in Hive, as the attribute names are not available to us.
Importing data from HDFS
We have seen how to export data from DynamoDB to HDFS; in the same manner, you can import data into DynamoDB from flat files on HDFS. First, we need to create a Hive table that is linked to a directory on HDFS. Then, we need to create another table that is linked to the DynamoDB table into which the data should be put. Now, you can simply insert the data from the first table into the second, and you will see the data imported into the DynamoDB table.
In the following example, we import the data present at the HDFS path /data/employee into the Employee table in DynamoDB:
CREATE EXTERNAL TABLE packtPubEmployee_hdfs (empid string, yoj string, department string, salary bigint, manager string)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 'hdfs:///data/employee/';

CREATE EXTERNAL TABLE packtPubEmployee (empid string, yoj string, department string, salary bigint, manager string)
STORED BY 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES ("dynamodb.table.name" = "Employee",
"dynamodb.column.mapping" = "empid:empId,yoj:yoj,department:dept,salary:salary,manager:manager");