with a single node. Thus, the implementation of the transaction manager component
does not require any distributed synchronization and is similar to the transaction
manager of any single-node relational database management system. The key
difference is that in G-Store, transactions are limited to smaller logical entities
(key groups). A similar approach has been followed by the Google Megastore
system [72]. It implements a transactional record manager on top of the BigTable
data store [99] and provides transaction support across multiple data items, where
programmers have to manually link data items into hierarchical groups and each
transaction can only access a single group. Megastore partitions the data into a
collection of entity groups, a priori user-defined groupings of data for fast operations,
where each group is independently and synchronously replicated over a wide area.
In particular, Megastore tables are either entity group root tables or child tables.
Each child table must declare a single distinguished foreign key referencing a root
table. Thus, each child entity references a particular entity in its root table (called
the root entity). An entity group consists of a root entity along with all entities
in child tables that reference it. Entities within an entity group are mutated with
single-phase ACID transactions (for which the commit record is replicated via
Paxos). Operations across entity groups could rely on expensive two-phase commit
operations, but they can instead leverage Megastore's efficient built-in asynchronous
messaging. Google's Spanner [113] has been presented
as a scalable, globally distributed database that shards data across many sets of
Paxos state machines located in datacenters spread all over the world. Spanner
automatically reshards data across machines as the amount of data or the number
of servers changes, and it automatically migrates data across machines (even across
datacenters) to balance load and in response to failures. It supports general-purpose
transactions, and provides a SQL-based query language.
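The entity-group model described above can be sketched in Python. The class and method names here are illustrative assumptions, not Megastore's actual API; the point is that every child entity carries a reference to a root entity, and a transaction may only touch entities within the one group it opened.

```python
# Hedged sketch of Megastore-style entity groups (hypothetical names).

class EntityGroup:
    """A root entity plus all child entities that reference it."""
    def __init__(self, root_key):
        self.root_key = root_key
        self.entities = {}   # key -> value, all belonging to this group
        self.log = []        # per-group commit log (replicated via Paxos in Megastore)

class GroupTransaction:
    """An ACID transaction restricted to a single entity group."""
    def __init__(self, group):
        self.group = group
        self.writes = {}

    def put(self, key, value, root_key):
        if root_key != self.group.root_key:
            # Cross-group writes would need two-phase commit or queued messaging.
            raise ValueError("transaction may only access its own entity group")
        self.writes[key] = value

    def commit(self):
        # Append one commit record for the whole group, then apply the writes.
        self.group.log.append(dict(self.writes))
        self.group.entities.update(self.writes)

group = EntityGroup(root_key="user:42")
txn = GroupTransaction(group)
txn.put("user:42/photo:1", {"caption": "hi"}, root_key="user:42")
txn.commit()
```

A write that names a key outside the transaction's group is rejected up front; in Megastore such cross-group work would fall back to two-phase commit or its asynchronous messaging.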
The Deuteronomy project [173] presented a radically different approach to scaling
databases and supporting transactions in the cloud by unbundling the database into
two components: (1) The transactional component (TC) that manages transactions
and their concurrency control and undo/redo recovery but knows nothing about
physical data location. (2) The data component (DC) that maintains a data cache and
uses access methods to support a record-oriented interface with atomic operations
but knows nothing about transactions. Applications submit requests to the TC
which uses a lock manager and a log manager to logically enforce transactional
concurrency control and recovery. The TC passes requests to the appropriate Data
Component (DC). The DC, guaranteed by the TC to never receive conflicting
concurrent operations, needs to only support atomic record operations, without
concern for transaction properties that are already guaranteed by the TC. In this
architecture, data can be stored anywhere (e.g., on local disk, in the cloud, etc.) as the
TC functionality in no way depends on where the data is located. The TC and DC
can be deployed in a number of ways. Both can be located within the client, and that
is helpful in providing fast transactional access to closely held data. The TC could
be located with the client while the DC could be in the cloud, which is helpful
when a user would like to use their own subscription to a TC service or wants to perform
transactions that involve manipulating data in multiple locations. Both TC and DC
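The TC/DC unbundling described above can be sketched as follows. The names and the deliberately simplified lock protocol are illustrative assumptions, not the actual Deuteronomy interfaces: the TC does locking and write-ahead logging, while the DC only performs atomic single-record operations and knows nothing about transactions.

```python
# Hedged sketch of a Deuteronomy-style TC/DC split (hypothetical names).

class DataComponent:
    """DC: maintains records; supports only atomic single-record operations
    and knows nothing about transactions."""
    def __init__(self):
        self.records = {}

    def read(self, key):
        return self.records.get(key)

    def write(self, key, value):
        self.records[key] = value

class TransactionalComponent:
    """TC: enforces locking and redo logging; it does not care where the
    DC's data physically lives (local disk, remote server, cloud)."""
    def __init__(self, dc):
        self.dc = dc
        self.locked = set()      # keys held by in-flight transactions
        self.redo_log = []

    def execute(self, writes):
        """Run one transaction's writes: lock, log, then ship to the DC."""
        keys = set(writes)
        if keys & self.locked:
            # The TC guarantees the DC never sees conflicting concurrent ops.
            raise RuntimeError("conflict with an in-flight transaction")
        self.locked |= keys
        try:
            for k, v in writes.items():
                self.redo_log.append((k, v))   # write-ahead logging at the TC
                self.dc.write(k, v)            # DC performs an atomic record op
        finally:
            self.locked -= keys

dc = DataComponent()
tc = TransactionalComponent(dc)
tc.execute({"account:1": 100, "account:2": 200})
```

Because the TC talks to the DC only through this narrow record interface, the two components can be co-located with the client, split between client and cloud, or both placed in the cloud without changing the transaction logic.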