10.2 Data Replication and Consistency Management
In general, stateless services are easy to scale since any new replicas of these
services can operate completely independently of other instances. In contrast,
scaling stateful services, such as database systems, requires guaranteeing a consistent
view of the system for users of the service. However, the cost of maintaining several
database replicas that are always strongly consistent is very high. As we have
previously described, according to the CAP theorem, most of the NoSQL systems
overcome the difficulties of distributed replication by relaxing the consistency guar-
antees of the system and supporting various forms of weaker consistency models
(e.g., eventual consistency [226]). In practice, a common feature of the NoSQL
and DaaS cloud offerings is the creation and management of multiple replicas
(usually 3) of the stored data, while a replication architecture runs behind the
scenes to enable automatic failover management and ensure high availability of the
service. In general, replicating for performance differs significantly from replicating
for availability or fault tolerance. The distinction between the two situations is
mainly reflected in the higher degree of replication, and consequently the need
to support weak consistency, when scalability is the motivating factor for
replication [95].
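As a rough illustration of why a higher degree of replication pushes systems toward weaker consistency, consider a quorum-replicated key-value store with N replicas per item: a read is only guaranteed to observe the latest write when the read and write quorums overlap (R + W > N), and shrinking the quorums trades that guarantee for lower latency and higher availability. The following minimal sketch, which assumes no particular system, simply encodes this rule:

N = 3  # replicas per data item (the 3 copies typically kept by cloud stores, as noted above)

def is_strongly_consistent(r: int, w: int, n: int = N) -> bool:
    # Read and write quorums overlap, so every read quorum contains
    # at least one replica holding the latest acknowledged write.
    return r + w > n

print(is_strongly_consistent(r=2, w=2))  # True:  strong consistency, higher latency
print(is_strongly_consistent(r=1, w=1))  # False: eventual consistency, faster reads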
Several studies have attempted to quantify the consistency
guarantees of cloud storage services. Wada et al. [228] presented an approach for
measuring time-based staleness by writing timestamps to a key from one client,
reading the same key and computing the difference between the reader's local time
and the timestamp read. Bermbach and Tai [78] tried to address some of these
limitations by extending the original experiments of [228] with a number of
geographically distributed readers. They measure the consistency window by
calculating the difference between the latest read timestamp of version n and the
write timestamp of version n + 1. Their experiments with Amazon S3
showed that the system frequently violates monotonic read consistency.
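A minimal sketch of this timestamp-based measurement is shown below. It assumes a hypothetical key-value client exposing put(key, value) and get(key); following the single-client setup of [228], the same process writes and reads the key, so its local clock can be compared directly with the timestamp embedded in the value (the geographically distributed multi-reader variant of [78] would run the read loop on several machines):

import time

def measure_staleness(store, key, versions=100, poll_interval=0.01):
    # `store` is a hypothetical client with put(key, value) and get(key).
    # For each version, write a timestamped value and poll until that
    # version becomes visible; staleness is the reader's local time minus
    # the write timestamp carried inside the value.
    samples = []
    for n in range(versions):
        store.put(key, {"version": n, "written_at": time.time()})
        while True:
            value = store.get(key)
            if value is not None and value["version"] >= n:
                samples.append(time.time() - value["written_at"])
                break
            time.sleep(poll_interval)
    return samples  # one staleness estimate (in seconds) per version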
Anderson et al. [65] presented an offline algorithm that analyzes the trace of
interactions between the client machines and the underlying key-value store and
reports how many consistent-read violations occur in the trace. This approach
is useful for checking the safety of executed operations and detecting violations
of their semantics. However, it is not useful for systems that require online
monitoring of their data staleness or consistency guarantees.
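The algorithm of [65] reasons over the full graph of operations; the snippet below is only a much-simplified illustration of the offline idea, scanning a logged trace of reads and counting monotonic-read violations, i.e. cases where a client observes an older version of a key after having already seen a newer one:

def monotonic_read_violations(trace):
    # `trace` is a list of (client_id, key, observed_version) tuples,
    # ordered by the completion time of each read.
    latest_seen = {}   # (client_id, key) -> highest version observed so far
    violations = 0
    for client_id, key, version in trace:
        prev = latest_seen.get((client_id, key), -1)
        if version < prev:
            violations += 1          # client went back in time: stale read
        else:
            latest_seen[(client_id, key)] = version
    return violations

# Example: client c1 reads version 2 and then version 1 of the same key.
print(monotonic_read_violations([("c1", "k", 2), ("c1", "k", 1)]))  # 1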
Zellag and Kemme [237] proposed an approach for real-time detection of
consistency anomalies for arbitrary cloud applications accessing various types of
cloud datastores in transactional or non-transactional contexts. In particular, the
approach builds the dependency graph during the execution of a cloud application
and detects cycles in the graph at the application layer, independently of the
underlying datastore. Bailis et al. [71] presented an approach that provides expected
bounds on staleness by predicting the behavior of eventually consistent quorum-
replicated data stores using Monte Carlo simulations and an abstract model of the
storage system including details such as the distribution of latencies for network
links.
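A deliberately simplified sketch of this kind of Monte Carlo prediction is given below. It assumes exponentially distributed replication delays and ignores read latencies, which the model of [71] accounts for in much more detail; it estimates the probability that a read issued t seconds after a write is acknowledged still misses the new version:

import random

def stale_read_probability(n=3, w=1, r=1, t=0.0, trials=100_000, mean_delay=0.01):
    # Each trial: the write reaches the n replicas after i.i.d. exponential
    # delays and is acknowledged once w replicas have it; a read issued t
    # seconds later contacts r random replicas and is stale only if none of
    # them has applied the write yet.
    stale = 0
    for _ in range(trials):
        arrivals = sorted(random.expovariate(1.0 / mean_delay) for _ in range(n))
        read_time = arrivals[w - 1] + t
        contacted = random.sample(range(n), r)
        if all(arrivals[i] > read_time for i in contacted):
            stale += 1
    return stale / trials

print(stale_read_probability(t=0.0))            # N=3, W=1, R=1: reads often stale
print(stale_read_probability(w=2, r=2, t=0.0))  # quorums overlap: ~0.0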