Database Reference
In-Depth Information
Fig. 3.3 PNUTS system architecture (figure labels: Region 1, Region 2, message broker, routers, tablet controller, storage units)
and a complete copy of each table. Regions are typically, but not necessarily,
geographically distributed. At the physical level, data tables are horizontally
partitioned into groups of records called tablets. These tablets are scattered
across many servers, and each server might hold hundreds or thousands of tablets.
The assignment of tablets to servers is flexible, which allows workloads to be
balanced by moving a few tablets from an overloaded server to an under-loaded
server.
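
The balancing idea above can be illustrated with a minimal sketch. It is not the PNUTS implementation: the per-tablet load metric, the function name rebalance, and the "move the tablet closest to half the gap" heuristic are all assumptions made for illustration.

```python
def rebalance(assignment, load, max_moves=3):
    """Move up to max_moves tablets from the busiest server to the least busy one.

    assignment: dict mapping server name -> set of tablet ids (mutated in place)
    load: dict mapping tablet id -> numeric load (hypothetical metric,
          for example requests per second)
    Returns the list of moves as (tablet, from_server, to_server) tuples.
    """
    def server_load(server):
        return sum(load[t] for t in assignment[server])

    moves = []
    for _ in range(max_moves):
        busiest = max(assignment, key=server_load)
        idlest = min(assignment, key=server_load)
        gap = server_load(busiest) - server_load(idlest)
        # Only tablets lighter than the gap can narrow it when moved.
        candidates = [t for t in assignment[busiest] if 0 < load[t] < gap]
        if not candidates:
            break
        # Assumed heuristic: pick the tablet whose load is closest to half the gap.
        tablet = min(candidates, key=lambda t: abs(load[t] - gap / 2))
        assignment[busiest].remove(tablet)
        assignment[idlest].add(tablet)
        moves.append((tablet, busiest, idlest))
    return moves
```

Because only a few tablets move per call, the bulk of the data stays where it is, which is the property the text attributes to the flexible tablet-to-server assignment.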
The query language of PNUTS supports selection and projection from a single
table. Operations for updating or deleting an existing record must specify the
primary key. The system is designed primarily for online serving workloads that
consist mostly of queries that read and write single records or small groups of
records. Thus, it provides a multiget operation which supports retrieving
multiple records in parallel by specifying a set of primary keys and an optional
predicate. The router component (Fig. 3.3) is responsible for determining which
storage unit needs to be accessed for a given record to be read or written by the
client. To this end, the primary-key space of a table is divided into intervals
where each interval corresponds to one tablet. The router stores an interval
mapping which defines the boundaries of each tablet and maps each tablet to a
storage unit. The query model of PNUTS does not support join operations, which
are too expensive in such massive-scale systems.
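
The interval mapping held by the router can be sketched as a sorted list of split keys searched with binary search. This is an illustrative sketch only: the class name Router, its method names, and the storage-unit labels are hypothetical, and the real PNUTS router is not described at this level of detail in the text.

```python
import bisect
from collections import defaultdict


class Router:
    """Toy interval mapping: primary key -> tablet -> storage unit."""

    def __init__(self, boundaries, tablet_to_unit):
        # boundaries: sorted split keys. Tablet 0 covers keys below
        # boundaries[0], tablet i covers [boundaries[i-1], boundaries[i]),
        # and the last tablet covers keys >= boundaries[-1].
        assert len(tablet_to_unit) == len(boundaries) + 1
        self.boundaries = boundaries
        self.tablet_to_unit = tablet_to_unit

    def tablet_for(self, key):
        # Binary search for the interval that contains the key.
        return bisect.bisect_right(self.boundaries, key)

    def storage_unit_for(self, key):
        return self.tablet_to_unit[self.tablet_for(key)]

    def plan_multiget(self, keys):
        # Group the requested keys by storage unit so that each unit
        # can be contacted once, and the units queried in parallel.
        plan = defaultdict(list)
        for key in keys:
            plan[self.storage_unit_for(key)].append(key)
        return dict(plan)


router = Router(["g", "n", "t"], ["su1", "su2", "su1", "su3"])
plan = router.plan_multiget(["apple", "peach", "zebra"])
```

Here plan maps "su1" to ["apple", "peach"] and "su3" to ["zebra"], so a multiget over three keys touches only two storage units. Restricting queries to primary-key lookups is what keeps this routing step a cheap in-memory search.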
The PNUTS system does not have a traditional database log or archive data.
Instead, it relies on a pub/sub mechanism that acts as a redo log for replaying
updates that are lost, due to failure, before being applied to disk.
PNUTS provides a consistency model that lies between the two extremes of general
serializability and eventual consistency [226]. The design of this model is derived
from the observation that web applications typically manipulate one record at a time
while different records may have activity with different geographic locality. Thus, it
provides per-record timeline consistency, where all replicas of a given record apply
all updates to the record in the same order. In particular, for each record, one of the
replicas is designated (independently of other records) as the master, and all updates
to that record are forwarded to the master. The master replica for a record is adaptively
changed to suit the workload where the replica receiving the majority of write requests
 