Database Reference
within Google as the underlying data store, supporting more than 60 projects, including Gmail,
YouTube, Google Analytics, Google Finance, Orkut, Personalized Search, and Google Earth.
Bigtable runs on top of the Google File System (GFS).
It is useful to understand Bigtable, at least to a certain degree, because many of its attributes and
design decisions are explicitly copied in Cassandra. Although Cassandra gets its design for
consistency and partition tolerance from Amazon Dynamo, Cassandra's data model is based more
closely on Bigtable's. For example, Cassandra borrows from Bigtable (sometimes with modification)
the implementation of SSTables, memtables, Bloom filters, and compactions (see the Glossary
for definitions of these terms; they are explored in detail elsewhere in this topic as
appropriate). In this way, Cassandra supports a somewhat richer data model than Dynamo, something
more flexible and layered than a simple key-value store, as it supports sparse, semistructured data.
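To make "sparse, semistructured" concrete, here is a minimal Python sketch. It is not Cassandra's or Bigtable's storage code, and the class name SparseTable and its methods are hypothetical, chosen only for illustration. The point it shows is that each row key maps to its own set of column names, so two rows in the same table need not share any columns.

```python
from collections import defaultdict


class SparseTable:
    """Toy sparse table: row key -> {column name -> value}.

    Hypothetical illustration only; not Cassandra's or Bigtable's API.
    """

    def __init__(self):
        # Rows are created lazily; columns a row never set are simply absent.
        self._rows = defaultdict(dict)

    def put(self, row_key, column, value):
        self._rows[row_key][column] = value

    def get(self, row_key, column, default=None):
        # Use .get on the outer mapping so reads never create empty rows.
        return self._rows.get(row_key, {}).get(column, default)

    def columns(self, row_key):
        # Column names present for this row, in sorted order.
        return sorted(self._rows.get(row_key, {}))


table = SparseTable()
table.put("user:1", "email", "a@example.com")
table.put("user:1", "city", "Oslo")
table.put("user:2", "phone", "555-0100")  # shares no columns with user:1
```

In a plain key-value store, each key would map to one opaque value; here the value is itself a map of named columns, which is roughly the extra layer the paragraph above describes.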
NOTE
I very much encourage you to read the Google Bigtable paper; it's an excellent read. However, keep in
mind that although Cassandra borrows many key ideas from Bigtable, there is generally not a 1:1
correspondence in ideas or implementation. For example, Bigtable defines master and slave nodes,
which Cassandra does not. And while Cassandra's data model and storage mechanism are based on
Bigtable and use the same terminology in many places, that is not always the case: Bigtable reads
and writes are close but not identical to their Cassandra implementations; Bigtable defines a Tablet
structure that is not strictly present in Cassandra; and so on. You can read the paper at
http://labs.google.com/papers/bigtable.html.
Cassandra does contrast with Bigtable in several areas, however, not least of which is that
Cassandra maintains a decentralized model. In Bigtable there is a master server that controls
operations using the Chubby persistent distributed locking mechanism; in Cassandra, all the nodes
are peers with no centralized control, and they communicate using a gossip model.
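The idea of peers exchanging state without a master can be sketched as a small simulation. This is a toy under stated assumptions, not Cassandra's actual gossip protocol: the Node class, the heartbeat counters, and the run_round and converge helpers are all hypothetical names introduced for illustration. Every node runs the same code, picks a random peer each round, and the two keep the highest version either has seen for each known node.

```python
import random


class Node:
    """Toy peer: every node runs the same code; there is no master."""

    def __init__(self, name):
        self.name = name
        # What this node believes about the cluster: name -> heartbeat version.
        self.state = {name: 0}

    def beat(self):
        # Bump this node's own heartbeat version.
        self.state[self.name] += 1

    def gossip_with(self, peer):
        # Both sides keep the highest version seen for every known node.
        merged = dict(self.state)
        for name, version in peer.state.items():
            merged[name] = max(merged.get(name, -1), version)
        self.state = dict(merged)
        peer.state = dict(merged)


def run_round(nodes, rng):
    # In one round, each node beats once and gossips with one random peer.
    for node in nodes:
        node.beat()
        peer = rng.choice([n for n in nodes if n is not node])
        node.gossip_with(peer)


def converge(nodes, rng, max_rounds=100):
    """Gossip until every node knows every other; return rounds used, or None."""
    names = {n.name for n in nodes}
    for round_no in range(1, max_rounds + 1):
        run_round(nodes, rng)
        if all(set(n.state) == names for n in nodes):
            return round_no
    return None


rng = random.Random(42)  # fixed seed so the run is repeatable
cluster = [Node(f"node{i}") for i in range(5)]
rounds = converge(cluster, rng)
```

No node in this sketch is special: removing any one of them leaves the others able to keep exchanging state, which is the property the master-based design lacks.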
Bigtable relies on a distributed lock service called Chubby for several different things: ensuring
that there is at most a single master replica at any given time; managing server bootstrapping,
discovery, and death; and storing the schema information.
Website: None, but you might be interested in a related project called Google Fusion Tables,
which is available at http://tables.googlelabs.com.
Orientation: Columnar
Created: By Google, Inc. Development started in 2004, with the paper published in 2006.
Implementation language: C++
Distributed: Yes