consistency depending on a statistics-based policy for decision making. Keeton et al. [24] have proposed a similar approach in a system called LazyBase that allows users to trade off query performance and result freshness. LazyBase breaks up metadata processing into a pipeline of ingestion, transformation, and query stages that can be parallelized to improve performance and efficiency. By breaking up the processing, LazyBase can independently determine how to schedule each stage for a given set of metadata, thus providing more flexibility than existing monolithic solutions. LazyBase uses models of transformation and query performance to determine how to schedule transformation operations so as to meet users' freshness and performance goals while utilizing resources efficiently.
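The freshness-versus-performance trade-off that LazyBase exposes can be pictured with a small sketch. The code below is purely illustrative: the class name, the max_staleness_s parameter, and the scheduling policy are invented for this example and are not taken from the LazyBase paper.

import time

class LazyPipeline:
    """Illustrative three-stage pipeline: ingest -> transform -> query."""

    def __init__(self):
        self.raw_batches = []   # ingested but not yet transformed
        self.transformed = []   # transformed and cheap to query

    def ingest(self, records):
        # Ingestion is cheap: append the batch together with its arrival time.
        self.raw_batches.append({"ts": time.time(), "records": list(records)})

    def transform(self):
        # Transformation can be scheduled lazily and in bulk for efficiency.
        while self.raw_batches:
            batch = self.raw_batches.pop(0)
            self.transformed.extend(sorted(batch["records"]))  # stand-in for real work

    def query(self, predicate, max_staleness_s=float("inf")):
        # A loose staleness bound lets the query touch only the cheap transformed
        # store; a tight bound forces a slower scan of raw batches that the
        # transformer has not processed yet.
        results = [r for r in self.transformed if predicate(r)]
        cutoff = time.time() - max_staleness_s
        for batch in self.raw_batches:
            if batch["ts"] < cutoff:  # older than the staleness bound: must be visible
                results.extend(r for r in batch["records"] if predicate(r))
        return results

pipe = LazyPipeline()
pipe.ingest([5, 3, 9])
print(pipe.query(lambda r: r > 4))                     # fast but stale: []
print(pipe.query(lambda r: r > 4, max_staleness_s=0))  # fresh but slower: [5, 9]
pipe.transform()
print(pipe.query(lambda r: r > 4))                     # fast and fresh: [5, 9]

Here the same query can be answered cheaply against already-transformed data or more expensively against raw batches, which is the kind of choice LazyBase's scheduling models are meant to make on the user's behalf.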
In general, the simplicity of key-value stores comes at a price when higher levels of consistency are required. In these cases, application programmers must spend extra time and effort to handle the requirements of their applications, with no guarantee that all corner cases are covered, which can result in error-prone applications. In practice, data replication across different data centers is expensive. Inter-datacenter communication is prone to variation in round-trip times (RTTs) and to packet loss; RTTs are typically on the order of hundreds of milliseconds. Such large RTTs introduce a communication overhead that dominates the commit latencies observed by users. Systems therefore often sacrifice strong consistency guarantees to maintain acceptable response times, and many solutions rely on asynchronous replication mechanisms and weaker consistency guarantees.
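To make the latency argument concrete, a back-of-the-envelope comparison of commit latency as a function of cross-datacenter round trips is sketched below; the 150 ms RTT is an assumed figure chosen only to match "hundreds of milliseconds", not a measurement reported here.

# Back-of-the-envelope commit latency when coordination crosses data centers.
# Assumption: inter-datacenter RTT of ~150 ms; local processing time ignored.
RTT_MS = 150

protocols = {
    "asynchronous replication (ack locally, replicate later)": 0,
    "single wide-area round trip per commit": 1,
    "classic two-phase commit (prepare round + commit round)": 2,
}

for name, round_trips in protocols.items():
    print(f"{name}: ~{round_trips * RTT_MS} ms added to commit latency")

Even one avoided wide-area round trip removes on the order of a hundred milliseconds from every commit, which is why the protocols discussed next focus on minimizing synchronous cross-datacenter rounds.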
Several systems have recently been proposed to tackle these challenges. For example, Google Megastore [9] has been presented as a scalable and highly available datastore designed to meet the storage requirements of large-scale interactive Internet services. It relies on the Paxos protocol [20], a proven, optimal, fault-tolerant consensus algorithm with no requirement for a distinguished master, to achieve synchronous wide-area replication. Megastore's replication mechanism provides a single, consistent view of the data stored in its underlying database replicas. Replication is performed per entity group, an a priori grouping of data for fast operations, by synchronously replicating the group's transaction log to a quorum of replicas. In particular, Megastore uses a write-ahead log replication mechanism over a group of symmetric peers in which any node can initiate reads and writes. Each log append blocks on acknowledgments from a majority of replicas, and replicas in the minority catch up as they are able.
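The commit point in this scheme, appending a log entry and blocking until a majority of replicas acknowledge it while stragglers catch up later, can be sketched as follows. The replica interface and the thread-pool fan-out are invented for illustration; Megastore itself drives this through Paxos rather than through the simplified quorum call shown here.

from concurrent.futures import ThreadPoolExecutor, as_completed

class Replica:
    """Toy replica holding a write-ahead log; stands in for a real peer."""
    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.log = []

    def append(self, position, entry):
        if not self.reachable:
            raise ConnectionError(f"{self.name} unreachable")
        self.log.append((position, entry))
        return self.name  # acknowledgment

def replicate_log_entry(replicas, position, entry):
    """Append `entry` everywhere, but return once a majority has acknowledged."""
    majority = len(replicas) // 2 + 1
    acks = []
    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        futures = [pool.submit(r.append, position, entry) for r in replicas]
        for fut in as_completed(futures):
            try:
                acks.append(fut.result())
            except ConnectionError:
                pass  # minority replica will catch up later
            if len(acks) >= majority:
                return acks  # commit point: a quorum holds the log entry
    raise RuntimeError("could not reach a quorum of replicas")

replicas = [Replica("us"), Replica("eu"), Replica("asia", reachable=False)]
print(replicate_log_entry(replicas, position=1, entry="txn-42"))

The write is considered durable as soon as any majority acknowledges it, so a single slow or unreachable data center does not block commits.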
Kraska et al. [43] have proposed the MDCC (Multi-Data Center Consistency) commit protocol, which provides strongly consistent guarantees at a cost comparable to that of eventually consistent protocols. In particular, in contrast to the two-phase commit protocol (2PC) for transactional consistency, MDCC is designed to commit transactions in a single round trip across data centers in the normal operational case. It also requires no master node, so reads and updates can be issued from any node in any data center while ensuring that every commit has been received by a quorum of replicas, and it imposes no database partitioning requirements. The MDCC commit protocol can be combined with different read guarantees; the default configuration guarantees read committed consistency without lost updates.
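The practical difference between the two commit strategies is the number of wide-area message rounds paid on the commit path, which the toy functions below merely count. The node interface and function names are hypothetical; they do not reproduce MDCC's actual Paxos-based machinery or its handling of conflicts and failures.

class Node:
    """Toy storage node in some data center."""
    def __init__(self, name):
        self.name = name
        self.staged = {}
        self.committed = {}

    def prepare(self, txn_id, writes):   # used by 2PC, first message exchange
        self.staged[txn_id] = writes
        return True                       # vote yes

    def commit(self, txn_id):             # used by 2PC, second message exchange
        self.committed.update(self.staged.pop(txn_id))
        return True

    def accept(self, txn_id, writes):     # single message exchange per commit
        self.committed.update(writes)
        return True

def commit_2pc(nodes, txn_id, writes):
    """Classic 2PC: two wide-area rounds (prepare, then commit)."""
    if all(n.prepare(txn_id, writes) for n in nodes):  # round 1: collect votes
        for n in nodes:                                # round 2: broadcast decision
            n.commit(txn_id)
        return 2  # wide-area rounds paid
    return None

def commit_one_round(nodes, txn_id, writes):
    """MDCC-style goal: commit once a quorum accepts, in one wide-area round."""
    acks = sum(1 for n in nodes if n.accept(txn_id, writes))  # single round
    quorum = len(nodes) // 2 + 1
    return 1 if acks >= quorum else None

nodes = [Node("us"), Node("eu"), Node("asia")]
print("2PC wide-area rounds:", commit_2pc(nodes, "t1", {"x": 1}))
print("one-round wide-area rounds:", commit_one_round(nodes, "t2", {"y": 2}))

Halving the number of synchronous cross-datacenter rounds is what lets a protocol in this style approach the latency of eventually consistent stores while still confirming each commit at a quorum.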
In principle, we believe that the problem of data replication and consistency management across different data centers in the cloud environment has, thus far, not attracted sufficient attention from the research community.