They can be seen as namespaces for ColumnFamilies and are typically allocated one per application. SuperColumns represent columns that themselves have subcolumns (e.g., maps). Like Dynamo, Cassandra provides a tunable consistency model that lets an application choose the consistency level that suits it. For example, the application can choose how many acknowledgments must be received from different replicas before a WRITE operation is considered successful. Similarly, it can choose how many successful responses must be received for a READ before the result is returned to the client. In particular, every write operation can choose one of the following consistency levels:
a. ZERO : It ensures nothing. The write operation will be executed asynchronously in the system background.
b. ANY : It ensures that the write operation has been executed on at least one node.
c. ONE : It ensures that the write operation has been committed to at least one replica before responding to the client.
d. QUORUM : It ensures that the write has been executed on at least (N/2 + 1) replicas before responding to the client, where N is the total number of system replicas.
e. ALL : It ensures that the write operation has been committed to all N replicas before responding to the client.
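The acknowledgment thresholds above can be summarized in a short sketch. This is illustrative pseudologic, not Cassandra driver code; the function name and signature are assumptions made for this example.

```python
# Illustrative sketch (not Cassandra driver code): how many replica
# acknowledgments each write consistency level requires, given N replicas.

def required_write_acks(level: str, n_replicas: int) -> int:
    """Return the number of acknowledgments a write must collect
    before it is reported successful at the given consistency level."""
    if level == "ZERO":
        return 0                      # fire-and-forget; executed asynchronously
    if level == "ANY":
        return 1                      # any one node suffices
    if level == "ONE":
        return 1                      # at least one replica must commit
    if level == "QUORUM":
        return n_replicas // 2 + 1    # majority of the N replicas
    if level == "ALL":
        return n_replicas             # every replica must commit
    raise ValueError(f"unknown consistency level: {level}")

# With N = 5 replicas, a QUORUM write needs 3 acknowledgments.
print(required_write_acks("QUORUM", 5))  # -> 3
```

Note how QUORUM and ALL trade higher write latency for stronger durability guarantees, while ZERO and ANY favor availability.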
On the other hand, every read operation can choose one of the following consistency levels:
a. ONE : It will return the record of the first responding replica.
b. QUORUM : It will query all replicas and return the record with the most recent timestamp once at least a majority of replicas (N/2 + 1) have reported.
c. ALL : It will query all replicas and return the record with the most recent timestamp once all replicas have replied.
With ALL, therefore, any unresponsive replica will cause the read operation to fail. For read operations at the ONE and QUORUM consistency levels, a consistency check is always performed against the remaining replicas in the background to repair any inconsistencies.
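The QUORUM read behavior described above can be sketched as follows. This is an illustrative model, not Cassandra's implementation; the function and data shapes are assumptions made for this example.

```python
# Illustrative sketch (not Cassandra code): a QUORUM read over N replicas.
# Each replica reply is a (timestamp, value) pair; once a majority has
# responded, the value with the most recent timestamp is returned.

def quorum_read(replies, n_replicas):
    """replies: list of (timestamp, value) pairs received so far.
    Returns the freshest value once a majority (N/2 + 1) has reported,
    otherwise None (the read must keep waiting)."""
    quorum = n_replicas // 2 + 1
    if len(replies) < quorum:
        return None                       # not enough replicas yet
    # Pick the reply carrying the most recent timestamp.
    return max(replies, key=lambda r: r[0])[1]

# Three of five replicas have answered, so the quorum is met and the
# newest timestamp (42) wins.
print(quorum_read([(10, "old"), (42, "new"), (17, "stale")], 5))  # -> new
```

An ALL read would instead wait for all N replies, which is why a single unresponsive replica fails it.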
HBase* is another project based on the ideas of the Bigtable system. It uses the Hadoop distributed filesystem (HDFS) as its data storage engine. The advantage of this approach is that HBase does not need to worry about data replication, data consistency, and resiliency, because HDFS already handles them. The downside, however, is that HBase becomes constrained by the characteristics of HDFS, which is not optimized for random read access. In the HBase architecture, data is stored in a farm of Region Servers. A key-to-server mapping is used to locate
* http://hbase.apache.org/.
http://hadoop.apache.org/hdfs/.