Best Practices - Mastering DynamoDB


Managing time series data

We often need to store time series data in our database. We might keep writing to the
same table for years, and the table size would keep growing. Consider the example of an
order table where you save orders made by customers. You can choose the order ID as the
hash key and the date/time as the range key. This strategy would certainly segregate the
data, and you would be able to query data on order ID with date/time easily, but there is
a problem with this approach: recent data is likely to be accessed far more frequently
than older data.
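
As a rough illustration of the single-table design described above, the following sketch builds the request parameters for such an order table. The table name `orders`, the attribute names `order_id` and `order_datetime`, the string attribute types, and the throughput values are all assumptions chosen for this example, not taken from the text. The dictionary follows the shape accepted by the DynamoDB CreateTable operation.

```python
def order_table_definition(table_name="orders", read_units=5, write_units=5):
    """Build CreateTable parameters for an order table.

    All names and capacity values are illustrative assumptions.
    """
    return {
        "TableName": table_name,
        # Only key attributes need to be declared up front.
        "AttributeDefinitions": [
            {"AttributeName": "order_id", "AttributeType": "S"},
            {"AttributeName": "order_datetime", "AttributeType": "S"},
        ],
        "KeySchema": [
            # Hash key: decides which partition stores the item.
            {"AttributeName": "order_id", "KeyType": "HASH"},
            # Range key: sorts items that share the same hash key.
            {"AttributeName": "order_datetime", "KeyType": "RANGE"},
        ],
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


params = order_table_definition()
```

With boto3 installed and credentials configured, these parameters could be passed along as `boto3.client("dynamodb").create_table(**params)`.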

So, we might end up with some hot partitions, while others remain cold. To solve this
problem, it is recommended to create tables based on time range, which means creating a
new table for each week or month instead of saving all data in a single table. This
strategy helps avoid hot and cold partitions within one table. To read data for a
particular time range, you simply query the table for that range. This strategy also
helps when you need to purge data, as you can simply drop the tables you no longer need.
Alternatively, you can archive that data as flat files on Amazon S3, which is a cheap
data storage service from Amazon.
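
The table-per-period idea can be sketched with a few helper functions. This is a minimal sketch assuming monthly tables named `<prefix>_YYYY_MM`; the naming scheme, the `orders` prefix, and the function names are hypothetical choices for illustration, not something the text prescribes.

```python
from datetime import date


def table_name_for(day, prefix="orders"):
    """Return the monthly table name that holds data for the given date."""
    return f"{prefix}_{day.year:04d}_{day.month:02d}"


def tables_for_range(start, end, prefix="orders"):
    """List the monthly tables covering start..end (both inclusive)."""
    names = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        names.append(f"{prefix}_{year:04d}_{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return names


def tables_to_purge(existing, today, keep_months, prefix="orders"):
    """Return the existing tables older than the last `keep_months` months."""
    # Index of the oldest month to keep, counted in months since year 0.
    cutoff = today.year * 12 + (today.month - 1) - (keep_months - 1)
    cutoff_name = f"{prefix}_{cutoff // 12:04d}_{cutoff % 12 + 1:02d}"
    # Zero-padded names sort chronologically, so string comparison is enough.
    return sorted(
        name for name in existing
        if name.startswith(prefix + "_") and name < cutoff_name
    )
```

A write would go to `table_name_for(order_date)`, a read over a period would query each table returned by `tables_for_range`, and the names returned by `tables_to_purge` are candidates for archiving to S3 and then dropping, for example with the DynamoDB DeleteTable operation.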

We are going to see how to integrate AWS S3 with DynamoDB in Chapter 6, Integrating
DynamoDB with Other AWS Components.
