The Nonrelational Landscape - Cassandra: The Definitive Guide

operation was successful. To achieve this, communication between replicas is based on a peer-

to-peer (P2P) communication protocol called “gossip,” which we'll examine further in terms of

Cassandra inherits.

The requirements for the Dynamo architecture were clear. In order to support a highly available

model, the team decided to tune down the consistency “knob.” Again, this is perfectly acceptable

for their given use case. They also wanted a very easy-to-use query model, so the data is referenced

using unique keys and stored simply as byte arrays. This eliminates the need for any sophisticated

schema design and allows Amazon to put effort toward low-latency and high-throughput

performance optimizations and their other primary goals.
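
The query model described above can be illustrated with a minimal sketch. This is not Dynamo's actual API; the class name OpaqueKeyValueStore and its put and get methods are hypothetical, chosen only to show that the store looks data up by a unique key and never interprets the bytes it holds.

```python
from typing import Dict, Optional


class OpaqueKeyValueStore:
    """Hypothetical in-memory store: unique keys map to uninterpreted bytes."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        # The store never inspects the value; it only copies the bytes.
        self._data[key] = bytes(value)

    def get(self, key: str) -> Optional[bytes]:
        # Lookup is by key only; there is no schema and no secondary query.
        return self._data.get(key)


store = OpaqueKeyValueStore()
store.put("cart:42", b'{"items": ["book"]}')
raw = store.get("cart:42")  # the caller decides how to decode these bytes
```

Because the value is opaque, any serialization and any notion of structure live entirely in the application, which is the trade-off the paragraph above describes.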

To achieve an acceptable level of consistency, Dynamo must support some sort of versioning

mechanism so that replicas can know which node has the most recent (valid) copy of written

data. So it employs something called a vector clock, in which each process maintains a numeric

reference to the most recent event it's aware of. Another facet of the architecture that Cassandra

shares with Dynamo is the hinted handoff.
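
To make the vector clock idea concrete, here is a minimal sketch in Python. It is an illustration of the general technique, not the code used by Dynamo or Cassandra; the function names (increment, descends, compare) and the node names are assumptions made for this example. Each clock is a mapping from node name to a counter of the events that node has produced.

```python
from typing import Dict

Clock = Dict[str, int]


def increment(clock: Clock, node: str) -> Clock:
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def descends(a: Clock, b: Clock) -> bool:
    """True if clock a has seen every event that clock b has seen."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def compare(a: Clock, b: Clock) -> str:
    """Classify a relative to b: equal, newer, older, or concurrent."""
    a_covers_b = descends(a, b)
    b_covers_a = descends(b, a)
    if a_covers_b and b_covers_a:
        return "equal"
    if a_covers_b:
        return "newer"
    if b_covers_a:
        return "older"
    # Neither version has seen all of the other's events: a conflict.
    return "concurrent"


# Hypothetical history: node_a writes, then node_a and node_b each
# update that same version independently.
v1 = increment({}, "node_a")
v2 = increment(v1, "node_a")
v3 = increment(v1, "node_b")
```

In this history, compare(v2, v1) reports that v2 supersedes v1, while compare(v2, v3) reports a concurrent pair, meaning neither replica holds a copy that strictly replaces the other and the conflict has to be reconciled.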

This section has summarized the basic points of the Amazon Dynamo paper in order to help you

understand its architectural goals and features. Although I very much encourage you to read the

Dynamo paper, be aware that Cassandra does diverge in its own ways, so don't take for granted

that something described there will necessarily apply to Cassandra. In short, Cassandra derives

its design around consistency and partition tolerance from Dynamo, and its data model is based

on Bigtable.

Project Voldemort

Voldemort was started as a project within LinkedIn when they encountered problems with simple

data partitioning to meet their scalability needs, similar to how Cassandra was started within

Facebook. Voldemort is a distributed, very simple key-value store, based on Amazon's Dynamo

and Memcached.

Performance numbers suggested by Jay Kreps of LinkedIn indicate approximately 20,000 reads

and 17,000 writes per second with one client and one server.

▪ Website : http://project-voldemort.com

▪ Orientation : Key-value store

▪ Created : 2008, by LinkedIn's Data and Analytics team for application to real-time problems

▪ Implementation language : Java

▪ Distributed : Yes
