Database Reference
In-Depth Information
The Dynamo system [30] is a highly available and scalable distributed key/value-
based datastore built for supporting internal Amazon's applications. Dynamo is used
to manage the state of services that have very high reliability requirements and need
tight control over the tradeoffs among availability, consistency, cost-effectiveness,
and performance. There are many services on Amazons platform that only need
primary-key access to a data store. The common pattern of using a relational data-
base would lead to inefficiencies and limit the ability to scale and provide high avail-
ability. Thus, Dynamo provides a simple primary-key-only interface to meet the
requirements of these applications. The query model of the Dynamo system relies on
simple read and write operations to a data item that is uniquely identified by a key.
State is stored as binary objects (blobs) identified by unique keys. No operations span
multiple data items.
Dynamo's partitioning scheme relies on a variant of consistent hashing mecha-
nisms [39] to distribute the load across multiple storage hosts. In this mechanism,
the output range of a hash function is treated as a fixed circular space or ring (i.e.,
the largest hash value wraps around to the smallest hash value). Each node in the
system is assigned a random value within this space, which represents its position
on the ring. Each data item identified by a key is assigned to a node by hashing the
data item's key to yield its position on the ring, and then walking the ring clock-
wise to find the first node with a position larger than the item's position. Thus, each
node becomes responsible for the region in the ring between it and its predecessor
node on the ring. The principle advantage of consistent hashing is that departure
or arrival of a node only affects its immediate neighbors and other nodes remain
unaffected.
In the Dynamo system, each data item is replicated at N hosts where N is a param-
eter configured per-instance. Each key k is assigned to a coordinator node. The coor-
dinator is in charge of the replication of the data items that fall within its range. In
addition to locally storing each key within its range, the coordinator replicates these
keys at the ( N − 1) clockwise successor nodes in the ring. This results in a system
where each node is responsible for the region of the ring between it and its N th pre-
decessor. As illustrated in Figure 9.4, node B replicates the key k at nodes C and D
Key K
A
G
B
Nodes B, C,
and D store
keys in
range (A, B)
including K.
F
C
E
D
FIGURE 9.4 Partitioning and replication of keys in the Dynamo ring. (From G. DeCandia
et al., Dynamo: Amazon's highly available key-value store, in SOSP , pp. 205-220, 2007.)
Search WWH ::




Custom Search