Best Practices - Mastering DynamoDB

Global secondary index best practices

Global secondary indexes allow us to create alternate hash and range keys on non-primary key attributes, which makes querying on those attributes quite an easy task. There are various best practices one should follow while using global secondary indexes, and we are going to discuss them in this section.

As we keep saying, it is very important to choose hash and range key attributes that distribute the load evenly across the partitions. We need to choose attributes having a wide variety of values as hash and range keys. Consider an example of a student table where we have columns such as roll number, name, grade, and marks. Here, the grade column would have values like A, B, C, and D, while the marks column would have the marks obtained by a particular student. The grade column has a very limited number of unique values. So, if we create an index on this column, most of the items would get stored on only a limited number of partitions, which is not a good way to access indexes. Instead, if you put an index on the marks column, the variety of data in that column would help to distribute data evenly across the partitions and hence improve the query performance.
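To make the grade versus marks comparison concrete, here is a minimal sketch in Python. It assumes a made-up data set of 1000 students and uses an MD5 hash as a stand-in for the partitioning function; DynamoDB's real hash function and partition count are internal, so the names partition_for and partitions_used and the figure of 16 partitions are purely illustrative.

```python
import hashlib
from collections import Counter


def partition_for(key_value, num_partitions):
    """Map a key value to a partition using a stable hash.

    MD5 is only a stand-in here; DynamoDB's actual hash function is internal.
    """
    digest = hashlib.md5(str(key_value).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions


def partitions_used(key_values, num_partitions):
    """Count how many items land on each partition."""
    return Counter(partition_for(v, num_partitions) for v in key_values)


# Hypothetical student data: 1000 students, grades A-D, marks 0-100.
grades = ["ABCD"[i % 4] for i in range(1000)]
marks = [i % 101 for i in range(1000)]

by_grade = partitions_used(grades, num_partitions=16)
by_marks = partitions_used(marks, num_partitions=16)

print(len(by_grade), "partitions hold data when keyed on grade")
print(len(by_marks), "partitions hold data when keyed on marks")
```

With only four distinct grades, at most four partitions can ever receive items, however many partitions exist. The marks values spread over many more of them.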

Many people these days use global secondary indexes for quick lookups. Often, we would have a huge table with a large number of attributes attached to each item. Querying such a table is quite an expensive operation. Also, we might not always need all the attributes of a given table. In that case, we can create a global secondary index on the primary key attributes, adding only the required attributes as projected attributes. This technique helps in providing quick lookups with less provisioned throughput, ultimately helping to reduce cost.
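As a rough sketch of such a lookup index, the following Python helper builds a dictionary in the shape that the CreateTable API expects for one entry of its GlobalSecondaryIndexes list. The helper name lookup_index, the index name, the attribute names (rollNumber, name, grade), and the capacity figures are assumptions made for illustration; they are not taken from a real table.

```python
def lookup_index(index_name, hash_key, projected, read_units, write_units):
    """Build one GlobalSecondaryIndexes entry that projects only selected attributes."""
    return {
        "IndexName": index_name,
        "KeySchema": [{"AttributeName": hash_key, "KeyType": "HASH"}],
        # INCLUDE copies only the listed non-key attributes into the index,
        # so index items stay small and cheap to read.
        "Projection": {
            "ProjectionType": "INCLUDE",
            "NonKeyAttributes": list(projected),
        },
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


# Hypothetical index on the student table's key, carrying just two attributes.
student_lookup = lookup_index(
    "StudentLookupIndex", "rollNumber", ["name", "grade"], 5, 5
)
print(student_lookup["Projection"])
```

The key attributes of the table and the index are always projected, so only the extra attributes needed by the lookup have to be listed.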

We can also create global secondary indexes to store duplicate table data. Here, we can create an index similar to the table schema and direct all queries to the index instead of the table. So, if we are expecting heavy read/write operations on the table, then regular queries can be directed to the index. This would allow us to keep the provisioned throughput constant for the table and also avoid a sudden burst, keeping all table transactions intact.
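Directing a query to an index rather than the table comes down to naming the index in the request. The sketch below, in Python, only assembles the request parameters in the shape used by the low-level Query API; the helper name index_query, the table name Student, the index name StudentCopyIndex, and the numeric rollNumber key are hypothetical.

```python
def index_query(table_name, index_name, key_name, key_value):
    """Build Query parameters that read from a global secondary index.

    Assumes the index hash key is a numeric attribute.
    """
    return {
        "TableName": table_name,
        "IndexName": index_name,  # without this, the query goes to the table
        "KeyConditionExpression": "#k = :v",
        "ExpressionAttributeNames": {"#k": key_name},
        "ExpressionAttributeValues": {":v": {"N": str(key_value)}},
    }


params = index_query("Student", "StudentCopyIndex", "rollNumber", 42)
print(params["IndexName"])
```

With boto3 installed and credentials configured, these parameters could be passed as boto3.client("dynamodb").query(**params). Reads served this way consume the index's read capacity, not the table's.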

We should make a note that Global Secondary Indexes (GSI) are eventually consistent and have their own provisioned throughput. An under-provisioned GSI can have a huge impact on table write throughput, because writes to the table are throttled when the index cannot keep up, and it may also lead us to exceed the provisioned throughput. To know more about this, thread.
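A simple sanity check for this situation is to compare each index's write capacity with the table's. The Python sketch below applies the rule of thumb that an index should have at least as much write capacity as its table; the function name, index names, and capacity numbers are invented for illustration, and the capacity an index really needs depends on item sizes and on how many writes touch indexed attributes.

```python
def under_provisioned_indexes(table_write_units, index_write_units):
    """Return the names of indexes whose write capacity is below the table's."""
    return sorted(
        name
        for name, units in index_write_units.items()
        if units < table_write_units
    )


# Hypothetical provisioning: the table has 100 write capacity units.
flagged = under_provisioned_indexes(100, {"MarksIndex": 100, "GradeIndex": 25})
print(flagged)  # ['GradeIndex']
```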