Consider an example: your database might include Customers and Orders. In a relational database,
referential keys must be defined in the database to allow you to join these tables and see, for
example, all the orders a particular customer has placed. Although you could do this in a key-value
store, typically you don't define any relationships in the data model itself; your application is
responsible for maintaining data integrity if, for example, you decide to delete a customer record.
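To make the point about application-maintained integrity concrete, here is a minimal sketch in
Python. It assumes a plain dictionary standing in for the key-value store, and every name in it
(the key prefixes, put_customer, put_order, orders_for, delete_customer) is hypothetical rather
than the API of any real product. The store knows nothing about the link between a customer and
its orders, so the application records that link itself and cleans it up on delete.

```python
# A plain dict stands in for the key-value store (an assumption for illustration).
store = {}


def put_customer(customer_id, name):
    # The application, not the store, keeps the list of related order ids.
    store[f"customer:{customer_id}"] = {"name": name, "order_ids": []}


def put_order(order_id, customer_id, item):
    store[f"order:{order_id}"] = {"customer_id": customer_id, "item": item}
    store[f"customer:{customer_id}"]["order_ids"].append(order_id)


def orders_for(customer_id):
    # The "join" is done by hand: look up each order key listed on the customer.
    customer = store[f"customer:{customer_id}"]
    return [store[f"order:{oid}"] for oid in customer["order_ids"]]


def delete_customer(customer_id):
    # Nothing in the store cascades the delete, so the application
    # removes the dependent orders itself.
    customer = store.pop(f"customer:{customer_id}")
    for oid in customer["order_ids"]:
        store.pop(f"order:{oid}", None)
```

If delete_customer skipped the loop over order_ids, the orders would stay behind as orphans, and
nothing in the store would complain.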
One criticism of key-value stores is that they are terrific if you need to scale to billions of
records, but that this use case is a concern only for very large, social-based web properties. The
suggestion is that key-value stores mean by definition that your application will see the database
as a single, enormous, globally accessible hashtable, which is difficult to maintain and hard on
programmer productivity.
There are many key-value stores in the wild today, including Tokyo Cabinet, Amazon's SimpleDB,
and Microsoft's Dynomite.
Amazon Dynamo
Dynamo is Amazon's proprietary key-value storage system. Though it's not usable by developers,
it's still important to discuss because it, along with Google Bigtable, inspired many of the design
decisions in Apache Cassandra.
In October of 2007, Werner Vogels, CTO of Amazon, published a white paper for the Association
for Computing Machinery (ACM) called “Dynamo: Amazon's Highly Available Key-value Store.” This
paper continues to be publicly available on his blog “All Things Distributed” at
http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf . The paper is rather
technical, but it is clear, concise, and very well written. I will just summarize the main points here.
Dynamo was born, as were many of the systems described in this chapter, from the need to honor
strict requirements for realizing high performance under continuous growth, meeting service-level
agreements (SLAs), remaining available under strenuous load and failures, gracefully handling
those failures, and allowing horizontal scale. Therefore, with respect to the CAP theorem,
Dynamo, like Cassandra, is highly available and eventually consistent. Failure handling in both
of these systems is regarded as a “normal case without impacting availability or performance.”
This is achievable because of the trade-off Dynamo makes with consistency.
Dynamo is used for Amazon's shopping cart, and of course consistency is important to Amazon.
For a service such as a web-based shopping cart, which does not have competing readers, eventual
consistency is more than worth the trade-off and will not be problematic. Although consistency is
not the main focus of this system, it is a “tuneable” property, such that “eventual” is perhaps a
misnomer.
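One way to picture a “tuneable” consistency property is as a threshold on how many replicas must
answer before an operation counts as successful. The Python sketch below is a toy model under
stated assumptions, not Dynamo's or Cassandra's actual interface: Replica, write, read,
required_acks, and required_responses are all hypothetical names, replicas are in-memory objects,
and versions are plain integers supplied by the caller.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    up: bool = True
    data: dict = field(default_factory=dict)  # key -> (version, value)


def write(replicas, key, value, version, required_acks):
    """Send the write to every reachable replica; succeed only if enough acknowledge."""
    acks = 0
    for replica in replicas:
        if replica.up:
            replica.data[key] = (version, value)
            acks += 1
    # Simplification: a write that falls short is reported as failed
    # but is not rolled back on the replicas that accepted it.
    return acks >= required_acks


def read(replicas, key, required_responses):
    """Ask every reachable replica; return the newest value if enough respond."""
    responses = [replica.data.get(key) for replica in replicas if replica.up]
    if len(responses) < required_responses:
        return None  # not enough replicas answered
    found = [entry for entry in responses if entry is not None]
    if not found:
        return None
    return max(found, key=lambda entry: entry[0])[1]  # highest version wins
```

Raising the thresholds asks for more agreement at the cost of availability when replicas are down;
lowering them does the reverse. With three replicas of which only one is reachable, a read that
needs one response still succeeds, while a read that needs two does not.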
As in Cassandra, consistency in Dynamo is governed by a configurable property that lets the user
decide how many replicas must respond successfully before it can be determined that an