Global secondary index best practices
Global secondary indexes allow us to define alternate hash and range keys on non-primary
key attributes, which makes querying on those attributes much easier. This section
discusses the best practices to follow when using global secondary indexes.
As we keep saying, it is very important to choose hash and range key attributes that
distribute the load evenly across the partitions, which means choosing attributes with a
wide variety of values. Consider a student table with columns such as roll number, name,
grade, and marks. The grade column holds values such as A, B, C, and D, while the marks
column holds the marks obtained by each student. The grade column has a very limited
number of unique values, so if we create an index on it, most of the items would be stored
on only a limited number of nodes, which is not a good way to access indexes. If we instead
create the index on the marks column, the variety of values in that column helps to
distribute the data evenly across the cluster and hence improves query performance.
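The effect of key variety can be illustrated with a small simulation. This is only a sketch: DynamoDB's real partitioning function is internal, so the MD5-based partition_for helper, the partition count of 16, and the generated grade and marks values are all assumptions made for the example.

```python
import hashlib
from collections import Counter


def partition_for(key_value, num_partitions):
    """Map a key value to a partition by hashing it (illustrative stand-in only)."""
    digest = hashlib.md5(str(key_value).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_partitions


def partitions_used(key_values, num_partitions):
    """Count how many items land on each partition for the given key values."""
    return Counter(partition_for(v, num_partitions) for v in key_values)


NUM_PARTITIONS = 16  # assumed partition count for the sketch

# 1,000 hypothetical students: grades cycle through A-D, marks range from 0 to 100.
grades = [("A", "B", "C", "D")[i % 4] for i in range(1000)]
marks = [i % 101 for i in range(1000)]

grade_spread = partitions_used(grades, NUM_PARTITIONS)
marks_spread = partitions_used(marks, NUM_PARTITIONS)

print("partitions touched when keyed by grade:", len(grade_spread))
print("partitions touched when keyed by marks:", len(marks_spread))
```

With only four distinct grades, the items can never occupy more than four partitions, however many partitions exist, whereas the marks values spread over many more of them.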
Global secondary indexes are also commonly used for quick lookups. We often have a huge
table with a large number of attributes attached to each item. Querying such a table is
quite expensive, and we might not always need all the attributes of an item. In that case,
we can create a global secondary index on the primary key attributes and add only the
required attributes as projected attributes. Queries against such an index read smaller
items, so lookups consume less provisioned throughput, which ultimately helps to reduce
cost.
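As a rough sketch, such a lookup index could be described with a dictionary in the shape that the AWS SDKs accept for an entry of the GlobalSecondaryIndexes parameter when creating a table. The lookup_index helper, the index name, the attribute names, and the capacity figures below are hypothetical and chosen only for illustration; nothing is sent to AWS.

```python
def lookup_index(index_name, hash_key, projected, read_units, write_units, range_key=None):
    """Build a GSI definition that projects only the listed non-key attributes."""
    key_schema = [{"AttributeName": hash_key, "KeyType": "HASH"}]
    if range_key is not None:
        key_schema.append({"AttributeName": range_key, "KeyType": "RANGE"})
    return {
        "IndexName": index_name,
        "KeySchema": key_schema,
        # INCLUDE copies the index/table keys plus the named attributes only.
        "Projection": {
            "ProjectionType": "INCLUDE",
            "NonKeyAttributes": list(projected),
        },
        "ProvisionedThroughput": {
            "ReadCapacityUnits": read_units,
            "WriteCapacityUnits": write_units,
        },
    }


# Hypothetical student table keyed on roll_number; the index keeps name and grade only.
student_lookup = lookup_index(
    index_name="StudentLookupIndex",
    hash_key="roll_number",
    projected=["name", "grade"],
    read_units=5,
    write_units=5,
)
print(student_lookup)
```

A definition like this would be passed along with the table's own key schema and attribute definitions. Attributes that are not projected are not available from a query on the index, so the projected list should cover everything the lookup needs.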
We can also create global secondary indexes to store duplicate table data. Here, we create
an index with the same schema as the table and direct queries to the index instead of the
table. So, if we are expecting heavy read/write operations on the table, regular queries
can be directed to the index. This allows us to keep the provisioned throughput of the
table constant and to avoid sudden bursts on it, keeping all table transactions intact.
We should note that Global Secondary Indexes (GSI) are eventually consistent: every write
to the table is propagated to its indexes asynchronously and consumes write capacity on
each affected index. This means an under-provisioned GSI can have a huge impact on table
write throughput, because writes to the table may be throttled when the index exceeds its
provisioned throughput. To know more about this, you can go through the
https://forums.aws.amazon.com/thread.jspa?threadID=143009 thread.
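One simple safeguard is to compare the write capacity of each index with that of its table before deploying. The sketch below assumes a rule of thumb that an index should have at least as many write capacity units as the table; the real requirement depends on which attributes each write touches, so this is a heuristic check rather than a guarantee. The function name, index names, and capacity figures are hypothetical.

```python
def under_provisioned_indexes(table_write_units, index_write_units):
    """Return the names of indexes whose write capacity is below the table's.

    index_write_units maps an index name to its provisioned write capacity units.
    """
    return sorted(
        name
        for name, units in index_write_units.items()
        if units < table_write_units
    )


# Hypothetical capacity settings for a table and two of its indexes.
table_writes = 100
index_writes = {"StudentLookupIndex": 100, "MarksIndex": 20}

flagged = under_provisioned_indexes(table_writes, index_writes)
print("indexes that may throttle table writes:", flagged)
```

Any index flagged this way is a candidate for more write capacity, or for a narrower projection so that fewer table writes need to update it.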