have been built to implement the concepts of these systems and make them available
to public users [94, 205]. Because they are easy to download and
install, these systems have attracted a lot of interest from the research community.
Few details have been published about the implementation of
most of these systems. In general, the NoSQL open source projects can be broadly
classified into the following categories:
• Key-value stores: These systems use the simplest data model, which is a collection
  of objects where each object has a unique key and a set of attribute/value pairs.
• Document stores: These systems have data models that consist of objects
  with a variable number of attributes, possibly including nested objects.
• Extensible record stores: These systems provide variable-width tables (column families)
  that can be partitioned vertically and horizontally across multiple nodes.
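The three categories above can be illustrated with a minimal sketch that uses plain Python dictionaries. The sample records, the field names, and the helper functions `vertical_partition` and `horizontal_partition` are hypothetical and serve only to show the shape of each data model. They are not taken from any particular system.

```python
import zlib

# Key-value store: each object has a unique key and a set of attribute/value pairs.
key_value_store = {
    "user:1": {"name": "Ada", "city": "London"},
    "user:2": {"name": "Alan", "city": "Manchester"},
}

# Document store: objects with a variable number of attributes, possibly nested.
document_store = [
    {"id": 1, "name": "Ada", "address": {"city": "London", "zip": "N1"}},
    {"id": 2, "name": "Alan", "languages": ["en"]},
]

# Extensible record store: row key -> column family -> columns.
# Rows may carry different families and different columns (variable width).
extensible_record_store = {
    "row1": {"profile": {"name": "Ada"}, "activity": {"last_login": "2011-01-01"}},
    "row2": {"profile": {"name": "Alan", "city": "Manchester"}},
}


def vertical_partition(table, family):
    """Keep only one column family of every row that has it."""
    return {
        row_key: families[family]
        for row_key, families in table.items()
        if family in families
    }


def horizontal_partition(table, num_nodes):
    """Spread whole rows across nodes by hashing the row key."""
    nodes = [dict() for _ in range(num_nodes)]
    for row_key, families in table.items():
        # crc32 gives a hash that is stable across runs (illustrative choice).
        index = zlib.crc32(row_key.encode("utf-8")) % num_nodes
        nodes[index][row_key] = families
    return nodes
```

Vertical partitioning splits a table by column family, while horizontal partitioning splits it by row. An extensible record store can combine both.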
Here, we give a brief introduction to some of these projects. For the full list,
we refer the reader to the NoSQL database website [34].
Cassandra [7] is presented as a highly scalable, eventually consistent, distributed,
structured key-value store [167, 168]. It was open sourced by Facebook in 2008.
It was designed by Avinash Lakshman (one of the authors of Amazon's Dynamo)
and Prashant Malik (a Facebook engineer). Cassandra brings together the distributed
systems technologies of Dynamo and the data model of Google's BigTable.
Like Dynamo, Cassandra is eventually consistent. Like BigTable, Cassandra provides
a ColumnFamily-based data model that is richer than typical key/value systems.
In Cassandra's data model, a column is the smallest unit of data. It is
a triplet that contains a name, a value, and a timestamp. A column family is
a container for columns, analogous to a table in a relational system. It contains
multiple columns, each of which has a name, a value, and a timestamp, and which are
referenced by row keys. A keyspace is the first dimension of the Cassandra hash and
is the container for column families. Keyspaces are of roughly the same granularity
as a schema or database (i.e., a logical collection of tables) in an RDBMS. A keyspace
can be seen as a namespace for ColumnFamilies and is typically allocated one per
application. SuperColumns are columns that themselves have subcolumns
(e.g., maps). Like Dynamo, Cassandra provides a tunable consistency model that
lets each application choose the consistency level that suits it.
For example, an application can choose how many acknowledgments must be
received from different replicas before a WRITE operation is considered
successful. Similarly, it can choose how many successful
responses must be received for a READ before the result is returned to
the client. In particular, every write operation can choose one of the following
consistency levels:
(a) ZERO: It guarantees nothing. The write operation is executed asynchronously
in the background.
(b) ANY: It ensures that the write operation has been executed on at least one node.
(c) ONE: It ensures that the write operation has been committed to at least one replica
before responding to the client.
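The data model and the write consistency levels listed so far can be made concrete with a small in-memory simulation. This is a minimal sketch under stated assumptions, not Cassandra's implementation or API. The names `Column`, `Replica`, `WriteConsistency`, and `write` are hypothetical. The difference between "node" (ANY) and "replica" (ONE) is modeled by letting the coordinating node keep the write in a `hints` list when no replica is reachable, which the text above does not spell out. Conflicts are resolved by keeping the column with the newest timestamp, also an assumption of this sketch.

```python
from dataclasses import dataclass, field
from enum import Enum


class WriteConsistency(Enum):
    ZERO = "zero"  # no guarantee
    ANY = "any"    # at least one node holds the write
    ONE = "one"    # at least one replica has committed the write


@dataclass(frozen=True)
class Column:
    """Smallest unit of data: a (name, value, timestamp) triplet."""
    name: str
    value: str
    timestamp: int


@dataclass
class Replica:
    up: bool = True
    # keyspace -> column family -> row key -> column name -> Column
    data: dict = field(default_factory=dict)

    def apply(self, keyspace, family, row_key, column):
        """Store the column; return True as an acknowledgment."""
        if not self.up:
            return False
        row = (self.data.setdefault(keyspace, {})
                        .setdefault(family, {})
                        .setdefault(row_key, {}))
        current = row.get(column.name)
        # Assumption: the column with the newest timestamp wins.
        if current is None or column.timestamp >= current.timestamp:
            row[column.name] = column
        return True


def write(replicas, hints, level, keyspace, family, row_key, column):
    """Return True if the write counts as successful at the given level."""
    # A real system would send these requests in parallel (and, for ZERO,
    # in the background); the sketch applies them one by one.
    acks = sum(r.apply(keyspace, family, row_key, column) for r in replicas)
    if level is WriteConsistency.ZERO:
        return True  # nothing is guaranteed, so nothing is checked
    if level is WriteConsistency.ONE:
        return acks >= 1
    # ANY: if no replica answered, the coordinating node keeps the write.
    if acks == 0:
        hints.append((keyspace, family, row_key, column))
    return True
```

For example, with every replica down, `write(..., WriteConsistency.ONE, ...)` returns `False`, while the same call with `WriteConsistency.ANY` returns `True` and leaves one entry in `hints`.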