does support this, it requires knowledge about the data distribution in order
to properly adjust the ranges.
21.4 NoSQL Databases
NoSQL databases have been classified into four subcategories:
1. Column family stores: An extension of the key-value architecture with
columns and column families; the overall goal is to process data
distributed over a pool of infrastructure, for example, HBase and
Cassandra.
2. Key-value pairs: This model is implemented using a hash table in which
each unique key points to a particular item of data, forming a
key-value pair, for example, Voldemort.
3. Document databases: This class of databases is modeled after Lotus
Notes and is similar to key-value stores. The data are stored as
documents represented in JSON or XML format. The biggest design
feature is the flexibility to nest multiple levels of key-value pairs, for
example, Riak and CouchDB.
4. Graph databases: Based on graph theory, this class of database
supports scalability across a cluster of machines. The complexity
of representation for extremely complex sets of documents is still
evolving, for example, Neo4j.
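
The four models above can be contrasted with a minimal sketch that uses plain Python structures as stand-ins. Nothing here is the API of any product named in the list; the sample keys, values, and the `neighbors` helper are hypothetical and serve only to show how the shape of the data differs between categories.

```python
# Illustrative only: plain Python structures standing in for each data model.
# All keys, values, and names below are made-up sample data.

# Key-value: a unique key maps to one opaque item of data (a hash table).
kv_store = {"user:42": b"serialized-profile-bytes"}

# Column family: row key -> column family -> column -> value.
column_family_store = {
    "user:42": {
        "profile": {"name": "Ada", "city": "London"},
        "activity": {"last_login": "2024-01-01"},
    }
}

# Document: a nested record with multiple levels of key-value pairs,
# of the kind usually serialized as JSON or XML.
document_store = {
    "user:42": {
        "name": "Ada",
        "address": {"city": "London", "geo": {"lat": 51.5, "lon": -0.12}},
    }
}

# Graph: nodes plus labeled, directed edges between them.
graph_store = {
    "nodes": {"ada": {"type": "person"}, "bob": {"type": "person"}},
    "edges": [("ada", "KNOWS", "bob")],
}


def neighbors(graph, node, label):
    """Return the targets of all edges leaving `node` with the given label."""
    return [dst for src, lbl, dst in graph["edges"] if src == node and lbl == label]
```

In the key-value case the store cannot look inside the value, whereas the column family and document models expose inner structure (`column_family_store["user:42"]["profile"]["city"]`, or the nested `geo` field), and the graph model makes relationships themselves the thing that is queried, as in `neighbors(graph_store, "ada", "KNOWS")`.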
21.4.1 Column-Oriented Stores or Databases
Hadoop HBase is the distributed database that supports the storage needs of
the Hadoop distributed programming platform. HBase was designed by taking
inspiration from Google BigTable; its main goal is to offer real-time
read/write operations for tables with billions of rows and millions of
columns by leveraging clusters of commodity hardware. The internal
architecture and logical model of HBase are very similar to those of Google
BigTable, and the entire system is backed by the Hadoop Distributed File
System (HDFS), which mimics the structure and services of the Google File
System (GFS).
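
The BigTable-style logical model is commonly described as a sparse, sorted map from a row key, a column (family plus qualifier), and a timestamp to a value. The following is a toy in-memory sketch of that idea only. The `SparseTable` class and its methods are hypothetical names chosen for illustration and are not the HBase client API; persistence, distribution across region servers, and HDFS are deliberately left out.

```python
from collections import defaultdict


class SparseTable:
    """Toy stand-in for a BigTable-style table (hypothetical, not HBase's API)."""

    def __init__(self):
        # row key -> (family, qualifier) -> {timestamp: value}
        # Only cells that were written take up space, so rows can be sparse.
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row, family, qualifier, value, ts):
        """Store one timestamped version of a cell."""
        self._rows[row][(family, qualifier)][ts] = value

    def get(self, row, family, qualifier):
        """Return the newest version of a cell, or None if it was never written."""
        versions = self._rows.get(row, {}).get((family, qualifier))
        if not versions:
            return None
        return versions[max(versions)]

    def scan(self, start_row, stop_row):
        """Yield row keys in sorted order within [start_row, stop_row)."""
        for row in sorted(self._rows):
            if start_row <= row < stop_row:
                yield row


# Sample usage with made-up data.
table = SparseTable()
table.put("row1", "info", "name", "Ada", ts=1)
table.put("row1", "info", "name", "Ada L.", ts=2)   # newer version of the same cell
table.put("row2", "info", "email", "bob@example.com", ts=1)
```

Here `table.get("row1", "info", "name")` returns the newest version of the cell, and `row2` simply has no `name` column at all, which is what lets a table hold millions of potential columns without storing empty cells. Keeping row keys sorted is what makes range scans such as `table.scan("row1", "row2")` natural in this model.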
21.4.2 Key-Value Stores (K-V Store) or Databases
Apache Cassandra is a distributed object store for managing large amounts
of structured data spread across many commodity servers. The system
is designed to avoid a single point of failure and offer a highly reliable
service. Cassandra was initially developed by Facebook; it later entered
the Apache Incubator and is now a top-level Apache project. Facebook in the
initial years had used