Using parallel scans
We saw the parallel scan and its advantages in Chapter 2, Data Models. A parallel scan is
often faster than a sequential scan, and it is a good option for tables that hold a large
amount of data. The table is logically divided into segments, and multiple workers scan
those segments at the same time. The workers can run at low priority and give way to
higher-priority work, so that critical application processes keep running smoothly.
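The segment-per-worker idea can be sketched as follows. This is a minimal illustration, not a definitive implementation: the helper names scan_segment and parallel_scan are hypothetical, and the scan argument is assumed to be any callable that accepts the Segment, TotalSegments, and ExclusiveStartKey keyword arguments and returns a dictionary with an Items list and an optional LastEvaluatedKey, which is how the DynamoDB Scan operation behaves.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_segment(scan, segment, total_segments):
    """Read every page of one segment and return its items.

    `scan` is an assumed callable shaped like the DynamoDB Scan operation.
    """
    items = []
    kwargs = {"Segment": segment, "TotalSegments": total_segments}
    while True:
        page = scan(**kwargs)
        items.extend(page.get("Items", []))
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            # No continuation key means this segment is exhausted.
            return items
        kwargs["ExclusiveStartKey"] = last_key


def parallel_scan(scan, total_segments):
    """Scan all segments concurrently, one worker thread per segment."""
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [
            pool.submit(scan_segment, scan, segment, total_segments)
            for segment in range(total_segments)
        ]
        results = []
        for future in futures:
            results.extend(future.result())
    return results
```

With boto3, the callable could be something like functools.partial(client.scan, TableName="MyTable"), where client is a DynamoDB client and MyTable is a placeholder table name. The sketch does not throttle the workers, so running them at low priority would need extra handling on top of it.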
Even though parallel scans are beneficial, we need to keep in mind that they consume a large
amount of provisioned throughput. We also need to make sure that the worker threads do not
block any other critical processes. To help us decide, Amazon has given some guidelines on
when to use parallel scans, which are as follows:
• The table data size is more than 20 GB
• The table's provisioned throughput capacity is not fully utilized
• Sequential scans are too slow to get the task done
To set a reasonable value for the TotalSegments parameter, we need to use trial and error.
We can start with any number, check the performance, vary the provisioned throughput units,
see how they impact the overall performance and cost, and then decide how many segments
work best.
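The trial-and-error loop can be sketched as a small timing harness. This is only an illustration under stated assumptions: time_segment_counts and fastest are hypothetical helper names, and run_scan is assumed to be any callable that performs a full parallel scan for a given number of segments. The harness measures elapsed time only; consumed capacity and cost would still have to be observed separately.

```python
import time


def time_segment_counts(run_scan, candidates):
    """Return {total_segments: elapsed_seconds} for each candidate count.

    `run_scan` is an assumed callable that runs one full scan with the
    given number of segments.
    """
    timings = {}
    for total_segments in candidates:
        start = time.perf_counter()
        run_scan(total_segments)
        timings[total_segments] = time.perf_counter() - start
    return timings


def fastest(timings):
    """Pick the segment count with the lowest elapsed time."""
    return min(timings, key=timings.get)
```

A run might try a few candidates such as [1, 2, 4, 8], then repeat the measurement after changing the provisioned throughput, since the best segment count can shift with the capacity available.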