• Supporting weaker consistency models than the ACID properties guaranteed
for transactions in most traditional RDBMSs. These models are usually referred
to as BASE models (Basically Available, Soft state, Eventually consistent) [196].
• Efficient use of distributed indexes and RAM for data storage.
• The ability to dynamically define new attributes or data schema.
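Two of the features above, eventual consistency and dynamically defined attributes, can be illustrated with a minimal Python sketch. It is not taken from any particular NoSQL system: the `Replica` class, its method names, and the last-write-wins rule based on caller-supplied version numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of the data; maps key -> (version, record dict). Hypothetical."""
    store: dict = field(default_factory=dict)

    def write(self, key, version, **attributes):
        # Records are plain dicts, so a new attribute needs no schema change.
        # Assumption: the caller supplies increasing version numbers.
        _, current = self.store.get(key, (0, {}))
        self.store[key] = (version, {**current, **attributes})

    def read(self, key):
        # May return stale data if this replica has not been synchronized yet.
        return self.store.get(key, (0, {}))[1]

    def sync_from(self, other):
        # Anti-entropy step: adopt every entry the other replica holds at a
        # higher version (last-write-wins, a simplifying assumption).
        for key, (version, record) in other.store.items():
            if version > self.store.get(key, (0, {}))[0]:
                self.store[key] = (version, dict(record))


a, b = Replica(), Replica()
a.write("user:1", 1, name="Ada")
a.write("user:1", 2, email="ada@example.org")  # attribute added on the fly
stale = b.read("user:1")   # replica b has not seen the writes yet
b.sync_from(a)
fresh = b.read("user:1")   # after synchronization both replicas agree
```

Between the write on `a` and the call to `sync_from`, a read served by `b` returns an older state. That window is the "soft state" that BASE models accept in exchange for availability.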
These design features are intended to achieve the following system goals:
• Availability: They must always be accessible, even during a network failure
or when an entire datacenter has gone offline.
• Scalability: They must be able to support very large databases with very high
request rates at very low latency.
• Elasticity: They must be able to satisfy changing application requirements in both
directions (scaling up or scaling down). Moreover, the system must be able to
gracefully respond to these changing requirements and quickly recover its steady
state.
• Load Balancing: They must be able to automatically move load between servers
so that most of the hardware resources are effectively utilized and no
resource becomes overloaded.
• Fault Tolerance: At scale, even the rarest hardware problems go from being
freak events to eventualities, and the system must be able to deal with them.
While hardware failure is still a serious concern, it needs to be addressed at the
architectural level of the database, rather than requiring developers,
administrators and operations staff to build their own redundant solutions.
• Ability to run in a heterogeneous environment: In scale-out environments, there
is a strong trend towards increasing the number of nodes that participate in
query execution. It is nearly impossible to get homogeneous performance across
hundreds or thousands of compute nodes. Part failures that do not cause complete
node failure, but result in degraded hardware performance, become more common
at scale. Hence, the system should be designed to run in a heterogeneous
environment and must take appropriate measures to prevent the performance
degradation that can arise from parallel processing on distributed nodes.
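As a rough illustration of the load balancing and elasticity goals above, the following Python sketch greedily moves partitions from the busiest server to the idlest one until their loads are within a tolerance. The `rebalance` function, the partition and server names, and the greedy strategy itself are assumptions chosen for illustration. They do not describe how any specific system balances load.

```python
def rebalance(servers, tolerance=1.0):
    """Move partitions from the busiest to the idlest server until balanced.

    `servers` maps a server name to {partition name: load}. It is modified
    in place, and the list of (partition, source, target) moves is returned.
    Hypothetical helper, not the algorithm of any particular system.
    """
    if not servers:
        return []

    def total(name):
        return sum(servers[name].values())

    moves = []
    while True:
        busiest = max(servers, key=total)
        idlest = min(servers, key=total)
        gap = total(busiest) - total(idlest)
        if gap <= tolerance:
            break
        # Only a partition lighter than the gap narrows it when moved.
        candidates = {p: load for p, load in servers[busiest].items()
                      if 0 < load < gap}
        if not candidates:
            break
        partition = max(candidates, key=candidates.get)
        servers[idlest][partition] = servers[busiest].pop(partition)
        moves.append((partition, busiest, idlest))
    return moves


# Scaling up: an empty server "s3" joins and picks up part of the load.
cluster = {"s1": {"p1": 5, "p2": 4, "p3": 3}, "s2": {"p4": 2}, "s3": {}}
moves = rebalance(cluster)
loads = {name: sum(parts.values()) for name, parts in cluster.items()}
```

Every move strictly reduces the imbalance between the two servers involved, so the loop terminates. A production system would also have to account for the cost of moving data, which this sketch ignores.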
In the following subsections, we provide an overview of the main NoSQL
systems that have been introduced and used internally by three of the key players
in the Web-scale data management domain: Google, Yahoo and Amazon.
Google: Bigtable
Bigtable is a distributed storage system for managing structured data that is designed
to scale to a very large size (petabytes of data) across thousands of commodity
servers [99]. It has been used by more than sixty Google products and projects,
such as the Google search engine, Google Finance, Orkut, Google Docs and Google
Earth. These products use Bigtable for a variety of demanding workloads which