Database Reference
BigTable data; (3) look up Tablet servers; (4) conduct error recovery in case of
Tablet server failures; (5) store BigTable schema information; (6) store the access
control table.
A BigTable deployment has three main components: a Master server, Tablet
servers, and a client library. BigTable allows only one active Master server, which
is responsible for assigning tablets to Tablet servers, detecting added or removed
Tablet servers, and balancing load. In addition, the Master handles changes to the
BigTable schema, e.g., creating tables and column families, and garbage-collects
files stored in GFS that have been deleted or are no longer used by a specific
BigTable instance. Every Tablet server manages a set of tablets and is responsible
for handling reads and writes to the tablets it has loaded, and for splitting
tablets when they grow too big. The accompanying client library is used by
applications to communicate with BigTable instances.
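The division of work between the Master and the Tablet servers can be sketched
in a few lines of Python. This is a toy model under stated assumptions, not
BigTable's implementation: the class names, the least-loaded assignment rule,
and the row-count split threshold are all hypothetical and chosen only to
illustrate tablet assignment, reassignment after a server is removed, and
tablet splitting.

```python
from dataclasses import dataclass, field


@dataclass
class Tablet:
    """A contiguous row range [start_key, end_key) plus its rows (toy model)."""
    start_key: str
    end_key: str
    rows: dict = field(default_factory=dict)


class TabletServer:
    """Serves reads and writes for its tablets and splits oversized ones."""

    def __init__(self, name, max_rows_per_tablet=4):
        self.name = name
        # Assumption: split on row count; the threshold is illustrative only.
        self.max_rows = max_rows_per_tablet
        self.tablets = []

    def _find(self, row_key):
        for tablet in self.tablets:
            if tablet.start_key <= row_key < tablet.end_key:
                return tablet
        raise KeyError(row_key)

    def write(self, row_key, value):
        tablet = self._find(row_key)
        tablet.rows[row_key] = value
        if len(tablet.rows) > self.max_rows:
            self._split(tablet)

    def read(self, row_key):
        return self._find(row_key).rows[row_key]

    def _split(self, tablet):
        # Cut the tablet at its median key; the upper half becomes a new tablet.
        keys = sorted(tablet.rows)
        mid = keys[len(keys) // 2]
        upper_rows = {k: v for k, v in tablet.rows.items() if k >= mid}
        upper = Tablet(mid, tablet.end_key, upper_rows)
        tablet.rows = {k: v for k, v in tablet.rows.items() if k < mid}
        tablet.end_key = mid
        self.tablets.append(upper)


class Master:
    """Single master: tracks servers and assigns tablets to the least loaded."""

    def __init__(self):
        self.servers = {}

    def add_server(self, server):
        self.servers[server.name] = server

    def assign(self, tablet):
        target = min(self.servers.values(), key=lambda s: len(s.tablets))
        target.tablets.append(tablet)
        return target.name

    def remove_server(self, name):
        # Reassign every tablet of the removed server to the remaining servers.
        orphaned = self.servers.pop(name).tablets
        return [self.assign(tablet) for tablet in orphaned]
```

In this sketch clients would talk to `TabletServer.read` and
`TabletServer.write` directly, while the `Master` only decides placement, which
mirrors the separation of roles described above.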
Cassandra
Cassandra is a distributed storage system for managing very large amounts of
structured data spread across many commodity servers [12]. The system was developed
by Facebook and became an open source tool in 2008. It adopts the ideas and
concepts of both Amazon Dynamo and Google BigTable, in particular combining the
distributed system technology of Dynamo with the BigTable data model. Tables
in Cassandra take the form of a distributed four-dimensional structured map,
where the four dimensions are row, column family, column, and super
column. A row is identified by a string key of arbitrary length. Regardless of
how many columns are read or written, an operation on a single row is
atomic. Columns may be grouped into sets called column families,
as in the BigTable data model. Cassandra provides two kinds of
column families: simple column families and super column families. A super column
groups any number of columns under the name of the super column. A column family
contains columns and super columns, which can be added to the
column family at run time. The partitioning and replication mechanisms of
Cassandra closely follow those of Dynamo in order to achieve consistency.
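The four-dimensional map can be pictured as nested dictionaries: row key, then
column family, then super column, then column. The following is a minimal
single-process sketch, not Cassandra's storage engine or API. The `Table` and
`Row` classes, the use of `None` as the super column of a simple column family,
and the per-row lock that stands in for row-level atomicity are all assumptions
made for illustration.

```python
import threading
from collections import defaultdict


class Row:
    """One row: column family -> super column (or None) -> column -> value."""

    def __init__(self):
        self._lock = threading.Lock()
        self.families = defaultdict(lambda: defaultdict(dict))

    def mutate(self, updates):
        # All updates to this row are applied under one lock, so a reader in
        # this process never observes a partially applied batch.
        with self._lock:
            for family, super_column, column, value in updates:
                self.families[family][super_column][column] = value

    def get(self, family, column, super_column=None):
        with self._lock:
            return self.families[family][super_column][column]


class Table:
    """Maps string row keys of arbitrary length to rows."""

    def __init__(self):
        self.rows = defaultdict(Row)

    def mutate(self, row_key, updates):
        self.rows[row_key].mutate(updates)

    def get(self, row_key, family, column, super_column=None):
        return self.rows[row_key].get(family, column, super_column)


table = Table()
table.mutate("user:42", [
    ("profile", None, "name", "Ada"),           # simple column family
    ("inbox", "thread-1", "msg-1", "hello"),    # super column "thread-1"
])
```

New columns and super columns appear simply by writing to them, which matches
the statement that they can be added to a column family at run time. The lock
models atomicity on one node only; replication across servers is outside the
scope of this sketch.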
Derivative Tools of BigTable
Since the BigTable code is not available under an open source license, several
open source projects have set out to implement the BigTable design in similar
systems, such as HBase and Hypertable.
HBase is a BigTable clone written in Java and is part of Apache's Hadoop
MapReduce framework [13]. HBase replaces GFS with HDFS. It writes
updates into memory and periodically flushes them to files on disk. Row
operations are atomic, equipped with row-level locking and transaction