How DynamoDB Works - Mastering DynamoDB

Database Reference

In-Depth Information

Load balancing

DynamoDB, being a distributed system, needs its data to be balanced across various nodes.

It uses consistent hashing for distributing data across the nodes. Consistent hashing dy-

namically partitions the data over the network and keeps the system balanced.

Consistent hashing is a classic solution to a very complex problem. The secret is finding a

node in a distributed cluster to store and retrieve a value identified by a key, while at the

same time being able to handle the node failures. You would say this is quite easy, as you

can simply number the servers and use some modulo hash function to distribute the load.

But the real problem is not only finding the node for storage or retrieval but handling re-

quests if a certain node goes down or is unreachable. At this point, you would be left with

only one option, that is, to rearrange your hash numbering and rebalance the data. But do-

ing this on each failure is a quite an expensive operation.

Consistent hashing uses a unique technique to solve this problem; here, both the nodes and

the keys are hashed and the same technique is used for their lookup. We first create a hash

space or hash ring. Determining a hash value for each node in a cluster is as good as pla-

cing that node on the circumference of the hash ring, as shown in the following diagram:

Search WWH ::

Custom Search

Home