Kraska et al. [163] have argued that finding the right balance between cost, consistency, and availability is not a trivial task. High consistency implies a high cost per transaction and, in some situations, reduced availability, but it avoids penalty costs. Low consistency leads to a lower cost per operation but might result in higher penalty costs. Hence, they presented a mechanism that not only allows designers to define consistency guarantees on the data instead of at the transaction level, but also allows the system to switch consistency guarantees automatically at runtime. They described a dynamic consistency strategy, called Consistency Rationing, that reduces the consistency requirements when possible (i.e., when the penalty cost is low) and raises them when it matters (i.e., when the penalty costs would be too high). The adaptation is driven by a cost model and by different strategies that dictate how the system should behave.
In particular, they divide the data items into three categories (A, B, C) and treat each category differently with respect to the consistency level provided. Category A contains data items that require strong consistency guarantees, because any consistency violation would result in large penalty costs. Category C contains data items that can be handled with session consistency, because temporary inconsistency is acceptable. Category B comprises all data items whose consistency requirements vary over time, depending, for example, on the actual availability of an item; data in this category is therefore handled with either strong or session consistency, according to a statistical policy for decision making.
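To make the idea concrete, the following Python sketch shows how such a per-item policy decision might look. It is illustrative only: the category names follow the paper, but the function name, the cost parameters, and the simple expected-penalty test are stand-ins for the richer cost model and statistical policies that Kraska et al. [163] actually describe.

```python
from enum import Enum

class Consistency(Enum):
    STRONG = "strong"    # higher cost per operation, no penalty risk
    SESSION = "session"  # cheaper, but temporary staleness is possible

class Category(Enum):
    A = "A"  # always strong: violations carry large penalty costs
    B = "B"  # adaptive: requirements vary over time
    C = "C"  # always session: temporary inconsistency is acceptable

def choose_consistency(category, penalty_cost, p_violation,
                       cost_strong_op, cost_session_op):
    """Pick a consistency level for one data item (illustrative).

    p_violation stands for an estimated probability that running with
    session consistency leads to a violation; the two cost parameters
    stand for the per-operation prices of each consistency level.
    """
    if category is Category.A:
        return Consistency.STRONG
    if category is Category.C:
        return Consistency.SESSION
    # Category B: ration consistency by comparing the expected penalty
    # of a violation against the extra cost of strong consistency.
    expected_penalty = p_violation * penalty_cost
    extra_cost = cost_strong_op - cost_session_op
    return (Consistency.STRONG if expected_penalty > extra_cost
            else Consistency.SESSION)
```

The design point is that the decision is attached to the data item rather than to the transaction, so the same transaction can touch A, B, and C items and pay the cost of strong consistency only where a violation would actually be expensive.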
Keeton et al. [106, 159] have proposed a similar approach in a system called LazyBase that allows users to trade off query performance against result freshness. LazyBase breaks metadata processing into a pipeline of ingestion, transformation, and query stages that can be parallelized to improve performance and efficiency. By breaking up the processing in this way, LazyBase can decide independently how to schedule each stage for a given set of metadata, which provides more flexibility than existing monolithic solutions. LazyBase uses models of transformation and query performance to determine how to schedule transformation operations so as to meet users' freshness and performance goals while utilizing resources efficiently.
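A minimal sketch of this staged design is shown below, under strong simplifying assumptions: the class and method names are hypothetical, and the real system batches, sorts, and merges updates through several transformation steps rather than the single fold shown here. The point is only to show where the freshness/performance trade-off lives.

```python
import threading

class LazyPipeline:
    """Illustrative staged pipeline in the spirit of LazyBase [106, 159].

    Updates are ingested cheaply into a raw buffer, folded into indexed
    "authority" state by a background transformation step, and queries
    choose between fast (authority only) and fresh (authority plus
    not-yet-transformed updates) answers.
    """

    def __init__(self):
        self.raw = []        # ingested but not yet transformed updates
        self.authority = {}  # fully transformed, query-optimized state
        self.lock = threading.Lock()

    def ingest(self, key, value):
        with self.lock:
            self.raw.append((key, value))  # cheap append, no indexing work

    def transform(self, batch_size=1000):
        # Background stage: fold a batch of raw updates into the
        # authority state. How often this runs is the knob that trades
        # resource usage against result freshness.
        with self.lock:
            batch, self.raw = self.raw[:batch_size], self.raw[batch_size:]
            for key, value in batch:
                self.authority[key] = value

    def query(self, key, fresh=False):
        with self.lock:
            if fresh:
                # Freshness-first: also scan untransformed updates (slower).
                for k, v in reversed(self.raw):
                    if k == key:
                        return v
            # Performance-first: read only the transformed state (staler).
            return self.authority.get(key)

pipe = LazyPipeline()
pipe.ingest("a", 1)
print(pipe.query("a"))              # None: 'a' not yet transformed
print(pipe.query("a", fresh=True))  # 1: fresh query pays to scan raw data
pipe.transform()
print(pipe.query("a"))              # 1: now served from authority state
```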
In general, the simplicity of key-value stores comes at a price when higher levels of consistency are required. In these cases, application programmers need to spend extra time and effort to handle the requirements of their applications, with no guarantee that all corner cases are covered, which consequently might result in error-prone applications. In practice, data replication across different data centers is expensive. Inter-datacenter communication is prone to variation in round-trip times (RTTs) and to packet loss, and RTTs are on the order of hundreds of milliseconds. Such large RTTs cause communication overhead that dominates the commit latencies observed by users. Therefore, systems often sacrifice strong consistency guarantees to maintain acceptable response times; hence, many solutions rely on asynchronous replication mechanisms and weaker consistency guarantees.
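The latency argument can be illustrated with a small simulation. The sketch below is hypothetical: the replica names and the 150 ms RTT are made up, and real systems add quorum logic, failure handling, and durability concerns. It shows only why a synchronous commit cannot complete in less than one inter-datacenter round trip, while an asynchronous commit returns as soon as the local write succeeds.

```python
import asyncio
import time

REMOTE_RTT = 0.150  # assumed ~150 ms inter-datacenter round trip

async def replicate(replica, record):
    # Stand-in for a cross-datacenter RPC: the ack arrives one RTT later.
    await asyncio.sleep(REMOTE_RTT)

async def commit_sync(record, replicas):
    # Strong consistency: acknowledge only after every replica has acked,
    # so commit latency is bounded below by the slowest round trip.
    await asyncio.gather(*(replicate(r, record) for r in replicas))

async def commit_async(record, replicas):
    # Weaker consistency: ship the update in the background and return
    # immediately; replicas converge later (and may briefly diverge).
    for r in replicas:
        asyncio.create_task(replicate(r, record))

async def main():
    replicas = ["us-east", "eu-west", "ap-south"]

    start = time.perf_counter()
    await commit_sync("x=1", replicas)
    print(f"synchronous commit:  {time.perf_counter() - start:.3f}s")  # ~0.150s

    start = time.perf_counter()
    await commit_async("x=2", replicas)
    print(f"asynchronous commit: {time.perf_counter() - start:.6f}s")  # ~0s
    await asyncio.sleep(REMOTE_RTT)  # let background replication finish

asyncio.run(main())
```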
Some systems have recently been proposed to tackle these challenges. For example, Google Megastore [72] has been presented as a scalable and highly available datastore designed to meet the storage requirements of large-scale interactive online services.