Cassandra is highly available. You can replace failed nodes in the cluster with no downtime, and
you can replicate data to multiple data centers to offer improved local performance and prevent
downtime if one data center experiences a catastrophe such as fire or flood.
Tuneable Consistency
Consistency essentially means that a read always returns the most recently written value.
Consider two customers attempting to put the same item into their shopping carts on an
ecommerce site. If I place the last item in stock into my cart an instant after you do, you should
get the item added to your cart, and I should be informed that the item is no longer available for
purchase. This is guaranteed to happen when the state of a write is consistent among all nodes
that have that data.
But there's no free lunch, and as we'll see later, scaling data stores means making certain
trade-offs between data consistency, node availability, and partition tolerance. Cassandra is
frequently called “eventually consistent,” which is a bit misleading. Out of the box, Cassandra
trades some consistency in order to achieve total availability. But Cassandra is more accurately
termed “tuneably consistent,” which means it allows you to easily decide the level of consistency
you require, in balance with the level of availability.
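To make the trade-off concrete, here is a minimal sketch of the overlap rule commonly used to reason about tuneable consistency: if the number of replicas that must acknowledge a write plus the number consulted on a read exceeds the replication factor, every read overlaps the latest write. The function names are hypothetical and chosen only for illustration; ONE, QUORUM, and ALL are Cassandra consistency levels, but this code models only the arithmetic and does not talk to a cluster.

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor: int,
                           write_replicas: int,
                           read_replicas: int) -> bool:
    """True when any read set must overlap any write set (R + W > N)."""
    return read_replicas + write_replicas > replication_factor


def replicas_that_may_be_down(replication_factor: int, required: int) -> int:
    """How many replicas can be unavailable while the operation still succeeds."""
    return replication_factor - required


if __name__ == "__main__":
    n = 3  # assumed replication factor for this illustration
    levels = [
        ("write ONE, read ONE", 1, 1),
        ("write QUORUM, read QUORUM", quorum(n), quorum(n)),
        ("write ALL, read ONE", n, 1),
    ]
    for name, w, r in levels:
        print(
            f"{name}: overlap guaranteed = {reads_see_latest_write(n, w, r)}, "
            f"writes tolerate {replicas_that_may_be_down(n, w)} replica(s) down"
        )
```

In this sketch, writing at ALL guarantees overlap even with a read at ONE, but a single unavailable replica blocks the write. Writing and reading at QUORUM keeps the overlap while tolerating one replica down out of three. That is the balance between consistency and availability described above.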
Let's take a moment to unpack this, as the term “eventual consistency” has caused some uproar
in the industry. Some practitioners hesitate to use a system that is described as “eventually
consistent.”
For detractors of eventual consistency, the broad argument goes something like this: eventual
consistency is maybe OK for social web applications where data doesn't really matter. After all,
you're just posting to mom what little Billy ate for breakfast, and if it gets lost, it doesn't really
matter. But the data I have is actually really important, and it's ridiculous to think that I could
allow eventual consistency in my model.
Set aside the fact that all of the most popular web applications (Amazon, Facebook, Google,
Twitter) are using this model, and that perhaps there's something to it. Presumably such data
is very important indeed to the companies running these applications, because that data is their
primary product, and they are multibillion-dollar companies with billions of users to satisfy in a
sharply competitive world. It may be possible to gain guaranteed, immediate, and perfect
consistency throughout a highly trafficked system running in parallel on a variety of networks,
but if you want clients to get their results sometime this year, it's a very tricky proposition.
The detractors claim that some Big Data databases such as Cassandra have merely eventual
consistency, and that all other distributed systems have strict consistency. As with so many things
in the world, however, the reality is not so black and white, and the binary opposition between