Importing data to DynamoDB
When Hive writes data into DynamoDB, the provisioned write capacity of the target table should be greater than the number of mappers in the EMR cluster. If it is not, the Hive operation may consume all of the table's write throughput, or attempt to consume more throughput than is provisioned; a short sketch of tuning the write-throughput share follows the steps below. Perform the following steps:
1. To import a table from S3 to DynamoDB, you can use the following commands:
-- External table over the comma-delimited data in S3
CREATE EXTERNAL TABLE uchit_s3_import (aa_col string,
bb_col bigint, cc_col array<string>)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 's3://bucketname/path/subpath/';

-- External table backed by the DynamoDB table "uchit"
CREATE EXTERNAL TABLE givenHiveTableName (col1 string,
col2 bigint, col3 array<string>)
STORED BY
'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES ("dynamodb.table.name" = "uchit",
"dynamodb.column.mapping" =
"col1:givenname,col2:givenyear,col3:givendays");

-- Copy the rows from S3 into DynamoDB
INSERT OVERWRITE TABLE givenHiveTableName SELECT *
FROM uchit_s3_import;
2. To import a table from HDFS to DynamoDB, use the following commands:
CREATE EXTERNAL TABLE uchit_hdfs_import (aa_col string,
bb_col bigint, cc_col array<string>)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 'hdfs:///directoryName';

CREATE EXTERNAL TABLE givenHiveTableName (col1 string,
col2 bigint, col3 array<string>)
STORED BY
'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
TBLPROPERTIES ("dynamodb.table.name" = "uchit",
"dynamodb.column.mapping" =
"col1:givenname,col2:givenyear,col3:givendays");

-- Copy the rows from HDFS into DynamoDB
INSERT OVERWRITE TABLE givenHiveTableName SELECT *
FROM uchit_hdfs_import;
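As noted earlier, the import should leave headroom in the table's write throughput. The following is a minimal sketch of tuning this from within the Hive session; it assumes the dynamodb.throughput.write.percent property of the DynamoDB storage handler shipped with EMR and reuses the table names from step 1, so adjust the value to match your provisioned capacity:
-- Let the Hive job use roughly half of the provisioned write capacity,
-- so the import does not consume all of the table's write throughput.
SET dynamodb.throughput.write.percent = 0.5;

INSERT OVERWRITE TABLE givenHiveTableName SELECT *
FROM uchit_s3_import;

-- Reading back through the storage handler consumes read throughput,
-- so keep any verification query small.
SELECT col1, col2 FROM givenHiveTableName LIMIT 10;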