10.2 Data Replication and Consistency Management
In general, stateless services are easy to scale since any new replicas of these
services can operate completely independently of other instances. In contrast,
scaling stateful services, such as database systems, requires guaranteeing a consistent
view of the system for users of the service. However, the cost of maintaining several
database replicas that are always strongly consistent is very high. As we have
previously described, according to the CAP theorem, most of the NoSQL systems
overcome the difficulties of distributed replication by relaxing the consistency guar-
antees of the system and supporting various forms of weaker consistency models
(e.g., eventual consistency [226]). In practice, a common feature of the NoSQL
and DaaS cloud offerings is the creation and management of multiple replicas
(usually 3) of the stored data, while a replication architecture runs behind the
scenes to enable automatic failover management and ensure high availability of the
service. In general, replicating for performance differs significantly from replicating
for availability or fault tolerance. The distinction between the two situations is
mainly reflected in the higher degree of replication, and consequently the need
to support weak consistency, when scalability is the motivating factor for
replication [95].
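As a rough illustration of why a higher degree of replication pushes systems toward weaker consistency, consider a quorum-replicated key-value store with N replicas per item: a read is only guaranteed to observe the latest write when the read and write quorums overlap (R + W > N), and shrinking the quorums trades that guarantee for lower latency and higher availability. The following minimal sketch, which assumes no particular system, simply encodes this rule:

N = 3  # replicas per data item (the 3 copies typically kept by cloud stores, as noted above)

def is_strongly_consistent(r: int, w: int, n: int = N) -> bool:
    # Read and write quorums overlap, so every read quorum contains
    # at least one replica holding the latest acknowledged write.
    return r + w > n

print(is_strongly_consistent(r=2, w=2))  # True:  strong consistency, higher latency
print(is_strongly_consistent(r=1, w=1))  # False: eventual consistency, faster reads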
Several studies have attempted to quantify the consistency
guarantees of cloud storage services. Wada et al. [228] presented an approach for
measuring time-based staleness by writing timestamps to a key from one client,
reading the same key and computing the difference between the reader's local time
and the timestamp read. Bermbach and Tai [78] tried to address some of these
limitations by extending the original experiments of [228] with a number of
geographically distributed readers. They measure the consistency window by
calculating the difference between the latest read timestamp of version n and the
write timestamp of version n + 1. Their experiments with Amazon S3
showed that the system frequently violates monotonic read consistency.
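A minimal sketch of this timestamp-based measurement is shown below. It assumes a hypothetical key-value client exposing put(key, value) and get(key); following the single-client setup of [228], the same process writes and reads the key, so its local clock can be compared directly with the timestamp embedded in the value (the geographically distributed multi-reader variant of [78] would run the read loop on several machines):

import time

def measure_staleness(store, key, versions=100, poll_interval=0.01):
    # `store` is a hypothetical client with put(key, value) and get(key).
    # For each version, write a timestamped value and poll until that
    # version becomes visible; staleness is the reader's local time minus
    # the write timestamp carried inside the value.
    samples = []
    for n in range(versions):
        store.put(key, {"version": n, "written_at": time.time()})
        while True:
            value = store.get(key)
            if value is not None and value["version"] >= n:
                samples.append(time.time() - value["written_at"])
                break
            time.sleep(poll_interval)
    return samples  # one staleness estimate (in seconds) per version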
Anderson et al. [65] presented an offline algorithm that analyzes the trace of
interactions between the client machines and the underlying key-value store and
reports how many consistent-read violations occur in the trace. This approach
is useful for checking the safety of executed operations and detecting violations
of their semantics. However, it is not useful for systems that require online
monitoring of their data staleness or consistency guarantees.
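The algorithm of [65] reasons over the full graph of operations; the snippet below is only a much-simplified illustration of the offline idea, scanning a logged trace of reads and counting monotonic-read violations, i.e. cases where a client observes an older version of a key after having already seen a newer one:

def monotonic_read_violations(trace):
    # `trace` is a list of (client_id, key, observed_version) tuples,
    # ordered by the completion time of each read.
    latest_seen = {}   # (client_id, key) -> highest version observed so far
    violations = 0
    for client_id, key, version in trace:
        prev = latest_seen.get((client_id, key), -1)
        if version < prev:
            violations += 1          # client went back in time: stale read
        else:
            latest_seen[(client_id, key)] = version
    return violations

# Example: client c1 reads version 2 and then version 1 of the same key.
print(monotonic_read_violations([("c1", "k", 2), ("c1", "k", 1)]))  # 1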
Zellag and Kemme [237] proposed an approach for real-time detection of
consistency anomalies for arbitrary cloud applications accessing various types of
cloud datastores in transactional or non-transactional contexts. In particular, the
approach builds the dependency graph during the execution of a cloud application
and detects cycles in the graph at the application layer, independently of the
underlying datastore. Bailis et al. [71] presented an approach that provides expected
bounds on staleness by predicting the behavior of eventually consistent quorum-
replicated data stores using Monte Carlo simulations and an abstract model of the
storage system including details such as the distribution of latencies for network
links.
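A deliberately simplified sketch of this kind of Monte Carlo prediction is given below. It assumes exponentially distributed replication delays and ignores read latencies, which the model of [71] accounts for in much more detail; it estimates the probability that a read issued t seconds after a write is acknowledged still misses the new version:

import random

def stale_read_probability(n=3, w=1, r=1, t=0.0, trials=100_000, mean_delay=0.01):
    # Each trial: the write reaches the n replicas after i.i.d. exponential
    # delays and is acknowledged once w replicas have it; a read issued t
    # seconds later contacts r random replicas and is stale only if none of
    # them has applied the write yet.
    stale = 0
    for _ in range(trials):
        arrivals = sorted(random.expovariate(1.0 / mean_delay) for _ in range(n))
        read_time = arrivals[w - 1] + t
        contacted = random.sample(range(n), r)
        if all(arrivals[i] > read_time for i in contacted):
            stale += 1
    return stale / trials

print(stale_read_probability(t=0.0))            # N=3, W=1, R=1: reads often stale
print(stale_read_probability(w=2, r=2, t=0.0))  # quorums overlap: ~0.0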