the corresponding server. The in-memory data storage is implemented using a
distributed memory object caching system called Memcached,* while the on-disk data
storage is implemented as an HDFS file residing in the Hadoop data node server.
The Hypertable project is designed to provide a high-performance, scalable,
distributed storage and processing system for structured and unstructured data. It is
designed to manage the storage and processing of information on a large cluster of
commodity servers, providing resilience to machine and component failures. Like
HBase, Hypertable also runs over HDFS to leverage the automatic data replication
and fault tolerance that it provides. In Hypertable, data is represented
as a multidimensional table of information. The Hypertable system provides a
low-level API and the Hypertable Query Language (HQL), which provide the ability to create,
modify, and query the underlying tables. The data in a table can be transformed and
organized at high speed by performing computations in parallel, pushing them to
where the data is physically stored.
CouchDB is a document-oriented database that is written in Erlang and can be
queried and indexed in a MapReduce fashion using JavaScript. In CouchDB, docu-
ments are the primary unit of data. A CouchDB document is an object that consists
of named fields. Field values may be strings, numbers, dates, or even ordered lists
and associative maps. Hence, a CouchDB database is a flat collection of documents
where each document is identified by a unique ID. CouchDB provides a RESTful
HTTP API for reading and updating (add, edit, delete) database documents. The
CouchDB document update model is lockless and optimistic. Document edits are
made by client applications. If another client has changed the same document in the
meantime, the client gets an edit conflict error on save. To resolve the update
conflict, the latest document version can be opened, the edits reapplied, and the update
retried. Document updates are all or nothing, either succeeding entirely or
failing completely. The database never contains partially saved or edited documents.
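The lockless, optimistic update model described above can be illustrated with a minimal sketch. This is not CouchDB's API: TinyDocStore, EditConflict, and update_with_retry are hypothetical names, the store is a plain in-memory dictionary, and the integer revision counter kept in a `_rev` field is an assumption made for illustration. The sketch only shows the pattern of saving against a known revision, receiving a conflict when that revision is stale, and then reopening the latest version, reapplying the edit, and retrying.

```python
import copy


class EditConflict(Exception):
    """Raised when a save is based on a stale revision of a document."""


class TinyDocStore:
    """Hypothetical in-memory stand-in for a document store (not CouchDB's API)."""

    def __init__(self):
        self._docs = {}  # document ID -> latest stored version

    def get(self, doc_id):
        # Hand out a copy so each client edits its own version.
        return copy.deepcopy(self._docs[doc_id])

    def save(self, doc):
        current = self._docs.get(doc["_id"])
        current_rev = current["_rev"] if current else None
        # Optimistic check: the caller must have edited the latest revision.
        if doc.get("_rev") != current_rev:
            raise EditConflict(doc["_id"])
        stored = copy.deepcopy(doc)
        stored["_rev"] = (current_rev or 0) + 1
        # All or nothing: the whole document is replaced in one step.
        self._docs[doc["_id"]] = stored
        return stored["_rev"]


def update_with_retry(store, doc_id, apply_edit, max_attempts=5):
    """Open the latest version, reapply the edit, and retry on conflict."""
    for _ in range(max_attempts):
        doc = store.get(doc_id)
        apply_edit(doc)
        try:
            return store.save(doc)
        except EditConflict:
            continue
    raise EditConflict(doc_id)


store = TinyDocStore()
store.save({"_id": "post-1", "title": "Hello", "tags": ["intro"], "views": 0})

first = store.get("post-1")   # two clients open the same revision
second = store.get("post-1")

first["views"] = 1
store.save(first)             # first client wins

second["title"] = "Hi"
try:
    store.save(second)        # stale revision: rejected as a whole
except EditConflict:
    # Resolve by reopening the latest version and reapplying the edit.
    update_with_retry(store, "post-1", lambda d: d.update(title="Hi"))
```

After the retry, the stored document carries both changes, because the second edit was reapplied on top of the latest version instead of overwriting it. No locks are held at any point; the cost of a conflict is paid only by the client that loses the race.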
MongoDB§ is another example of a distributed, schema-free, document-oriented
database; it was created at 10gen. It is implemented in C++ but provides drivers
for a number of programming languages, including C, C++, Erlang, Haskell, Java,
JavaScript, Perl, PHP, Python, Ruby, and Scala. It also provides a JavaScript
command-line interface. MongoDB stores documents as BSON (Binary JSON),
a binary encoding of JSON-like objects. BSON supports nested object
structures with embedded objects and arrays. At the heart of MongoDB is the concept
of a document that is represented as an ordered set of keys with associated values.
A collection is a group of documents. If a document is the MongoDB analog of a
row in a relational database, then a collection can be thought of as the analog to a
table. Collections are schema-free. This means that the documents within a single
collection can have any number of different shapes. MongoDB groups collections
into databases. A single instance of MongoDB can host several databases, each of
which can be thought of as completely independent. It provides eventual consistency
* http://memcached.org/.
http://hypertable.org/.
http://couchdb.apache.org/.
§ http://www.mongodb.org/.
http://www.10gen.com/.