Maintaining even read activity
We know that a scan operation fetches up to 1 MB of data in a single request (one page). We also
know that one read capacity unit covers two eventually consistent reads of up to 4 KB each per
second. This means that a single full scan page costs (1 MB / 4 KB) / 2 = 128 read capacity units,
which would be quite high if you have set your provisioned throughput very low. This sudden burst
of reads can exhaust the provisioned throughput for the given table, and if a very important
request arrives in the meantime, it may get throttled even after the default retries.
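As a quick check of this arithmetic, here is a minimal sketch of the same calculation; the constants reflect DynamoDB's documented 1 MB scan page size, 4 KB read unit size, and the half-unit cost of an eventually consistent read, while the function name is purely illustrative.

PAGE_SIZE_BYTES = 1 * 1024 * 1024      # a scan returns at most 1 MB per request
READ_UNIT_BYTES = 4 * 1024             # one read capacity unit covers up to 4 KB
EVENTUALLY_CONSISTENT_COST = 0.5       # eventually consistent reads cost half a unit

def scan_page_read_units(page_bytes=PAGE_SIZE_BYTES):
    # Capacity units consumed = (bytes read / 4 KB) * 0.5
    return (page_bytes / READ_UNIT_BYTES) * EVENTUALLY_CONSISTENT_COST

print(scan_page_read_units())          # 128.0 read capacity units for a full page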
It has also been observed that a scan operation reads a table sequentially, so it tends to draw
its capacity units from one partition at a time. The scan can therefore use up all of the
available capacity units of that partition by itself, and any other request arriving at the same
partition would not get served. To avoid this, we perform the following operations to maintain an
even load distribution for large scan operations:
• We can avoid a sudden burst of consumed capacity units by reducing the page size. Both the
scan and query operations support a Limit parameter, with which you can specify the number of
items to be retrieved per request. Smaller pages mean smaller bursts, and the gap between two
page requests gives DynamoDB a chance to process any other waiting request in between; a
sketch of this approach follows this list.
• Keeping two tables for the same data is also a good strategy. Here, we maintain two tables
with the same data, but each one is used for a different purpose: one is dedicated to
high-priority tasks, and the other is used for queries and scans. So, even if a scan operation
consumes all of one table's provisioned throughput, the other table can still take care of
high-priority or application-critical requests.
We have to keep in mind, though, that any write operation on one table must also be applied to
the other table to keep the two in sync.
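As a rough illustration of the first point, the following sketch pages through a scan with a small Limit and pauses briefly between pages; it assumes the boto3 SDK, and the table name, page size, and sleep interval are placeholder values.

import time
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('ProductCatalog')    # placeholder table name

scan_kwargs = {'Limit': 25}                 # small page keeps each burst small
while True:
    response = table.scan(**scan_kwargs)
    for item in response.get('Items', []):
        print(item)                         # replace with your own processing
    last_key = response.get('LastEvaluatedKey')
    if not last_key:
        break                               # no more pages to read
    scan_kwargs['ExclusiveStartKey'] = last_key
    time.sleep(0.5)                         # gap lets other requests be served

Lowering the Limit value or lengthening the pause trades scan speed for more headroom for other requests on the same partitions.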
You should also consider implementing error retries and exponential back-offs so that even if
more requests arrive than the provisioned throughput allows, the failed requests get retried
after exponentially increasing delays.
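A minimal sketch of such a retry loop, assuming boto3, is given below; the attempt count, base delay, table name, and key are placeholder values, and the AWS SDKs already apply their own retry and back-off logic, so this only shows the idea.

import time
import boto3
from botocore.exceptions import ClientError

client = boto3.client('dynamodb')

def get_with_backoff(table_name, key, max_attempts=5, base_delay=0.1):
    # Retry throttled reads with exponentially increasing delays.
    for attempt in range(max_attempts):
        try:
            return client.get_item(TableName=table_name, Key=key)
        except ClientError as error:
            if error.response['Error']['Code'] != 'ProvisionedThroughputExceededException':
                raise                                   # only retry throttling errors
            time.sleep(base_delay * (2 ** attempt))     # 0.1 s, 0.2 s, 0.4 s, ...
    raise RuntimeError('request still throttled after %d attempts' % max_attempts)

item = get_with_backoff('ProductCatalog', {'Id': {'N': '101'}})    # placeholder key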