DynamoDB with Redshift, Data Pipeline, and MapReduce - DynamoDB Applied Design Patterns

Loading data from DynamoDB into Redshift

Amazon Redshift integrates well with Amazon DynamoDB. Combining the two services gives
you advanced business intelligence capabilities and a powerful SQL-based interface for
database administrators and developers. By copying data from an Amazon DynamoDB table to
an Amazon Redshift cluster, you can run complex analytical queries on that data, including
joins. You can transfer all the data in an Amazon DynamoDB table into an Amazon Redshift
table using a single command run from within Amazon Redshift. To load data into Redshift
from DynamoDB, you first have to create the target table in Redshift. The table can be
temporary or persistent. The COPY command appends the new input data to any existing rows
in the table:

copy table_uchitredshift from 'dynamodb://table_uchitdynamodb'
credentials 'aws_access_key_id=xxxxx;aws_secret_access_key=xxx'
readratio 50;

In this example, the source table in DynamoDB is table_uchitdynamodb. The target table in
Amazon Redshift is table_uchitredshift. The readratio 50 clause regulates the percentage
of provisioned throughput that is consumed. In this case, the COPY command will use only
50 percent of the read capacity units provisioned for table_uchitdynamodb. I recommend
setting the ratio to a value less than the average unused provisioned throughput because
a lower value will minimize throttling issues.
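
As a rough sketch of the whole flow, the target table has to exist before COPY runs, and the read ratio should sit below the source table's average spare capacity. The column names (id, title, price) and the capacity figures in the comments are hypothetical and are not taken from the original example. COPY matches DynamoDB attribute names to Redshift column names, so in practice the columns would need to mirror the real attributes.

```sql
-- Hypothetical target table; the columns are assumed to mirror
-- attributes stored in table_uchitdynamodb.
create table table_uchitredshift (
    id    bigint not null,
    title varchar(256),
    price decimal(10,2)
);

-- Hypothetical capacity figures, for illustration only:
--   provisioned read capacity: 1000 units
--   average consumed capacity:  600 units
--   average unused capacity:    400 units, or 40 percent
-- A readratio below 40 (30 here) keeps the load inside the spare capacity.
copy table_uchitredshift from 'dynamodb://table_uchitdynamodb'
credentials 'aws_access_key_id=xxxxx;aws_secret_access_key=xxx'
readratio 30;
```

For a throwaway staging load, create temp table could replace create table; a temporary table is dropped at the end of the session.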

Tip

To run the COPY command, you must have the INSERT privilege on the target Amazon Redshift
table.
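
A minimal sketch of granting that privilege, assuming a hypothetical database user named etl_user who will run the load:

```sql
-- etl_user is a hypothetical user name, not part of the original example.
grant insert on table table_uchitredshift to etl_user;
```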

Remember that you are transferring data from a NoSQL environment to a SQL environment, so
some rules that hold in one environment do not apply in the other. The differences include
the following:

• DynamoDB table names are case sensitive, whereas Amazon Redshift table names are not.
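
To illustrate the case-sensitivity rule: by default Redshift folds unquoted identifiers to lowercase, so the two queries below would address the same table, while the DynamoDB name inside the dynamodb:// URI has to match the source table's case exactly. This is only a sketch that reuses the table names from the earlier example.

```sql
-- Both statements refer to the same Redshift table.
select count(*) from table_uchitredshift;
select count(*) from TABLE_UCHITREDSHIFT;

-- The DynamoDB table name is matched exactly; 'dynamodb://Table_UchitDynamoDB'
-- would point at a different (possibly nonexistent) DynamoDB table.
copy table_uchitredshift from 'dynamodb://table_uchitdynamodb'
credentials 'aws_access_key_id=xxxxx;aws_secret_access_key=xxx'
readratio 50;
```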
