Distributing the load by choosing the correct key
In a multipartition table, the data and the table's associated indexes are distributed
across servers. The distribution of an index across servers is determined by the value of
one of the index's attributes. Yes, you're correct: it is decided by the hash key value.
Unlike a table's hash key value (combined with the range key), index key values can be
duplicated, so many items can share one index hash key value and land on the same
partition. With local secondary indexes this problem does not occur, because the index
has the same hash key as the table and is therefore distributed in the same way as the
table. Keep this in mind when designing a global secondary index.
For example, let's take a look at the global secondary index Idx_Pub_Edtn that we created
at the start of this chapter. In this index, we set the Language attribute as the hash key
and the Edition attribute as the range key, so the index is distributed based on the
value of the Language attribute.
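As a rough illustration, the key schema described above can be written in the shape that the DynamoDB CreateTable API uses for a global secondary index entry. This is only a fragment, not the full definition from the start of the chapter: the projection setting is an assumption, throughput settings are left out, and the helper key_attribute is a hypothetical name introduced here.

```python
# Fragment of a global secondary index definition, in the shape of one entry
# of the CreateTable "GlobalSecondaryIndexes" parameter.
idx_pub_edtn = {
    "IndexName": "Idx_Pub_Edtn",
    "KeySchema": [
        {"AttributeName": "Language", "KeyType": "HASH"},   # hash key: decides distribution
        {"AttributeName": "Edition", "KeyType": "RANGE"},   # range key: orders items within a hash key
    ],
    "Projection": {"ProjectionType": "ALL"},  # assumption: not stated in this section
}


def key_attribute(index_definition, key_type):
    """Return the attribute name used for the given key type ("HASH" or "RANGE")."""
    for element in index_definition["KeySchema"]:
        if element["KeyType"] == key_type:
            return element["AttributeName"]
    raise ValueError(f"no {key_type} key in index definition")


# The attribute whose value determines how the index is spread across partitions.
print(key_attribute(idx_pub_edtn, "HASH"))
```

Whatever attribute is marked HASH here is the one whose value distribution matters for load balancing.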
Assume that our table has topics written in only four languages (English, German, Latin,
and Greek) and that four partitions are created for this table, so each server or partition
holds the topics written in one language. Most of the topics will be written in English and
far fewer in the other languages, so the server storing the English topics will handle many
more read and write requests than the others. This is called skewed data. We need to avoid
this kind of index, because it makes retrieval slower. Can we make the Edition attribute
the hash key and the Language attribute the range key of the global secondary index? I will
give you a hint about the table: as we all know, most of the topics will not make it to the
second edition.
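One way to explore the question is to measure how much of the data shares the single most common hash key value, since all items with the same hash key value are stored together. The sketch below uses made-up numbers (70 English topics, 10 each in German, Latin, and Greek, and one topic in ten reaching a second edition); these figures and the function name hottest_key_share are assumptions for illustration, not data from the table in this chapter.

```python
from collections import Counter


def hottest_key_share(items, hash_attr):
    """Return the most common value of hash_attr and the fraction of items that carry it."""
    counts = Counter(item[hash_attr] for item in items)
    value, count = counts.most_common(1)[0]
    return value, count / len(items)


# Hypothetical sample: 100 topics, mostly English, mostly first editions.
languages = ["English"] * 70 + ["German"] * 10 + ["Latin"] * 10 + ["Greek"] * 10
items = [
    {"Language": language, "Edition": 2 if i % 10 == 0 else 1}
    for i, language in enumerate(languages)
]

# Compare the two candidate hash keys for the index.
for candidate in ("Language", "Edition"):
    value, share = hottest_key_share(items, candidate)
    print(f"hash key {candidate}: {share:.0%} of items share the value {value!r}")
```

With this sample, a larger share of the items falls under one Edition value than under one Language value, so under these assumptions swapping the two keys would not remove the skew.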