higher degree of replication, and as a consequence the need for supporting weak
consistency when scalability is the motivating factor for replication [19].
Several studies have attempted to quantify the consistency
guarantees of cloud storage services. Wada et al. [64] presented an approach for
measuring time-based staleness by writing timestamps to a key from one client,
reading the same key, and computing the difference between the reader's local time
and the timestamp read. Bermbach and Tai [11] addressed some of these
limitations by extending the original experiments of [64] with a number of readers
that are geographically distributed. They measure the consistency window by cal-
culating the difference between the latest read timestamp of version n and the write
timestamp of version n +1. Their experiments with Amazon S3 showed that the sys-
tem frequently violates monotonic read consistency. Anderson et al. [4] presented an
offline algorithm that analyzes the trace of interactions between the client machines
and the underlying key-value store and reports how many consistent-read violations
occur in the trace. This approach is useful for checking the safety of executed opera-
tions and detecting violations of their semantics. However, it is not suitable for
systems that require online monitoring of
their data staleness or consistency guarantees. Zellag and Kemme [67] have proposed
an approach for real-time detection of consistency anomalies for arbitrary cloud
applications accessing various types of cloud datastores in transactional or nontrans-
actional contexts. In particular, the approach builds the dependency graph during
the execution of a cloud application and detects cycles in the graph at the application
layer and independently of the underlying datastore. Bailis et al. [8] presented an
approach that provides expected bounds on staleness by predicting the behavior of
eventually consistent quorum-replicated data stores using Monte Carlo simulations
and an abstract model of the storage system including details such as the distribution
of latencies for network links.
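The time-based staleness measurement described above can be sketched as follows. The in-memory store, key name, and function names here are illustrative stand-ins, not APIs from the cited systems; a real experiment would issue the same probes against an actual geo-replicated service.

```python
import time


class EventuallyConsistentStore:
    """Toy in-memory stand-in for a cloud key-value store, used only
    so the measurement logic below is runnable."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


def measure_staleness(store, key="staleness-probe"):
    """Write the writer's current timestamp to a key, read it back,
    and return the difference between the reader's local time and the
    timestamp read -- the time-based staleness probe attributed to
    Wada et al. [64] (assuming synchronized clocks)."""
    store.put(key, time.time())   # writer records its local time
    observed = store.get(key)     # reader fetches the stored timestamp
    return time.time() - observed # reader-side estimate of staleness


def violates_monotonic_reads(read_timestamps):
    """Check a sequence of version timestamps, in the order they were
    read, for a monotonic-read violation: a later read returning an
    older version than an earlier read."""
    return any(later < earlier
               for earlier, later in zip(read_timestamps,
                                         read_timestamps[1:]))
```

Against the single in-memory stub the measured staleness is near zero; pointing the same probes at a replicated store with geographically distributed readers is what exposes the consistency window and the monotonic-read violations reported for Amazon S3.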
Kraska et al. [42] have argued that finding the right balance among cost, consis-
tency, and availability is not a trivial task. High consistency implies high cost per
transaction and, in some situations, reduced availability but avoids penalty costs.
Low consistency leads to lower costs per operation but might result in higher penalty
costs. Hence, they presented a mechanism that not only allows designers to define
consistency guarantees at the data level rather than at the transaction level, but also allows
them to automatically switch consistency guarantees at runtime. They described a
dynamic consistency strategy, called Consistency Rationing, to reduce the consis-
tency requirements when possible (i.e., the penalty cost is low) and raise them when
it matters (i.e., the penalty costs would be too high). The adaptation is driven by a
cost model and different strategies that dictate how the system should behave. In
particular, they divide the data items into three categories (A, B, C) and treat each
category differently depending on the consistency level provided. The A category
represents data items for which strong consistency guarantees must be ensured, as
any consistency violation would result in large penalty costs. The C category repre-
sents data items that can be treated with session consistency, as temporary incon-
sistency is acceptable. The B category comprises all the data items whose
consistency requirements vary over time depending on the actual availability of an
item; the data of this category is therefore handled with either strong or session
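The category-based policy can be illustrated with a small sketch. The category names follow the text, but the function name, the penalty threshold, and its value are hypothetical; the paper drives the decision through a richer cost model than this single comparison.

```python
from enum import Enum


class Consistency(Enum):
    STRONG = "strong"
    SESSION = "session"


def choose_consistency(category, penalty_cost=0.0, penalty_threshold=1.0):
    """Pick a consistency level in the spirit of Consistency Rationing
    (Kraska et al. [42]): category A always receives strong guarantees,
    category C always receives session consistency, and category B
    adapts at runtime based on the expected penalty cost.
    The threshold is an illustrative stand-in for the paper's cost model."""
    if category == "A":
        # Violations are expensive, so always pay for strong consistency.
        return Consistency.STRONG
    if category == "C":
        # Temporary inconsistency is acceptable here.
        return Consistency.SESSION
    if category == "B":
        # Adaptive: use strong consistency only while the expected
        # penalty of an inconsistency outweighs its cost.
        return (Consistency.STRONG
                if penalty_cost > penalty_threshold
                else Consistency.SESSION)
    raise ValueError(f"unknown category: {category!r}")
```

With this sketch, `choose_consistency("B", penalty_cost=5.0)` escalates a B item to strong consistency when the stakes rise, while the same item runs under cheaper session consistency when the expected penalty is low.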