Figure 1-2. Where different databases appear on the CAP continuum
In this depiction, relational databases sit on the line between Consistency and Availability, which
means that they can fail in the event of a network failure (including a cable breaking). They
typically arrive at this position by defining a single master server, which could itself go down, or
an array of servers that lack sufficient built-in mechanisms to continue functioning in the case
of network partitions.
Graph databases such as Neo4j and the set of databases derived at least in part from the design
of Google's Bigtable database (such as MongoDB, HBase, Hypertable, and Redis) all focus
slightly less on Availability and more on ensuring Consistency and Partition Tolerance.
NOTE
If you're interested in the properties of other Big Data or NoSQL databases, see this topic's Appendix A.
Finally, the databases derived from Amazon's Dynamo design include Cassandra, Project Voldemort,
CouchDB, and Riak. These are more focused on Availability and Partition Tolerance.
However, this does not mean that they dismiss Consistency as unimportant, any more than Bigtable
dismisses Availability. According to the Bigtable paper, the average percentage of server
hours during which “some data” was unavailable is 0.0047% (section 4), so the distinction is
relative: we're talking about very robust systems already. If you think of each of these letters
(C, A, P) as a knob you can tune to arrive at the system you want, Dynamo derivatives are intended
for the many use cases where “eventual consistency” is tolerable, where “eventual” is a matter
of milliseconds, where read repairs mean that reads will return consistent values, and where you
can achieve strong consistency if you want to.
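
To make the "knobs" idea concrete, here is a minimal sketch of how a Dynamo-style store might trade consistency against availability. It assumes the commonly cited quorum rule, which the text above does not state: with N replicas, reading from R of them and writing to W of them gives overlapping quorums when R + W > N. The names Versioned, quorums_overlap, and read_with_repair are hypothetical and do not come from any particular database's API; the read repair shown is a toy in-memory illustration, not how any real system implements it.

```python
from dataclasses import dataclass


@dataclass
class Versioned:
    """A stored value tagged with a timestamp (hypothetical, for illustration)."""
    value: str
    timestamp: int


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """Return True when every read quorum shares a replica with every write quorum.

    Assumes the usual rule R + W > N for N replicas.
    """
    return r + w > n


def read_with_repair(replicas: list, r: int) -> str:
    """Read from the first r replicas, return the newest value seen,
    and overwrite any contacted replica that held an older version."""
    contacted = replicas[:r]
    newest = max(contacted, key=lambda v: v.timestamp)
    for i in range(len(contacted)):
        if replicas[i].timestamp < newest.timestamp:
            # Toy read repair: bring the stale replica up to date.
            replicas[i] = Versioned(newest.value, newest.timestamp)
    return newest.value


# Usage: three replicas, one of which has already received a newer write.
replicas = [Versioned("old", 1), Versioned("new", 2), Versioned("old", 1)]
latest = read_with_repair(replicas, r=2)
```

In this sketch, choosing R = 2 and W = 2 with N = 3 satisfies the overlap rule, while R = 1 and W = 1 does not and leaves the system in eventually consistent territory. Replicas that a read does not contact stay stale until a later read or write reaches them.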