flexible API calls give a degree of freedom to applications and users to choose their
consistency guarantees and control their availability, consistency, and performance
tradeoffs.
10.4.4 Google Spanner
Spanner [46] is a scalable, globally distributed database that provides synchronous
replication and ensures strong consistency. While many applications within Google
require georeplication for global availability and geographical locality reasons, a
large class of these applications still needs strong consistency and an SQL-like
query model. Google BigTable [6] still serves and manages data efficiently for many
applications, but it only guarantees eventual consistency at global scale and provides
a NoSQL API. Spanner is therefore designed to overcome BigTable's shortcomings for
this class of applications: it provides external consistency (linearizability) at
global scale and an SQL-like query language similar to that of Google
Megastore [42]. Data is stored in semi-relational tables to support the SQL-like
query language and general-purpose transactions.
The Spanner architecture consists of a universe that may contain several zones,
where a zone is the unit of administrative deployment. A zone is also
a location across which data may be replicated. Each zone contains a set of
spanservers that host data tables, which are split into data structures called
tablets. Spanner timestamps data to provide multi-versioning. A zonemaster is
responsible for assigning data to spanservers, whereas location proxies provide
clients with the information needed to locate the spanservers responsible for
their data. Moreover,
Spanner introduces an additional data abstraction called directories, which are
buckets that group data with the same access properties. The directory
abstraction is the unit used to perform and optimize data movement and location.
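
The roles described above can be illustrated with a small Python sketch. It is
not Spanner code: the class names Tablet and Zone, their methods, and the use of
integer timestamps are assumptions made only for illustration. The sketch shows
a tablet that keeps every timestamped version of a value, and a zone in which
one object plays both the zonemaster role (assigning a directory to a
spanserver) and the location proxy role (telling a client where it lives).

```python
import bisect
from collections import defaultdict


class Tablet:
    """Toy multi-version store: every key keeps all of its timestamped versions."""

    def __init__(self):
        self._timestamps = defaultdict(list)  # key -> sorted version timestamps
        self._values = defaultdict(list)      # key -> values, aligned with the above

    def write(self, key, timestamp, value):
        # Insert the new version so that the timestamps stay sorted.
        i = bisect.bisect_right(self._timestamps[key], timestamp)
        self._timestamps[key].insert(i, timestamp)
        self._values[key].insert(i, value)

    def read(self, key, timestamp):
        # Newest version written at or before `timestamp`, or None if there is none.
        i = bisect.bisect_right(self._timestamps.get(key, []), timestamp)
        return self._values[key][i - 1] if i else None


class Zone:
    """Hypothetical zone that plays the zonemaster and location proxy roles."""

    def __init__(self, name, spanserver_names):
        self.name = name
        # One toy tablet per spanserver keeps the sketch short.
        self.spanservers = {s: Tablet() for s in spanserver_names}
        self._assignment = {}  # directory name -> spanserver name

    def assign(self, directory, spanserver):
        # Zonemaster role: decide which spanserver holds a directory.
        if spanserver not in self.spanservers:
            raise KeyError(spanserver)
        self._assignment[directory] = spanserver

    def locate(self, directory):
        # Location proxy role: tell a client which spanserver holds a directory.
        return self._assignment[directory]
```

A client would first call locate() for its directory, then read from the tablet
of the returned spanserver at the timestamp it is interested in; reading at an
older timestamp returns the older version.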
Replication is supported by implementing a Paxos protocol. Each spanserver
associates a Paxos state machine with a tablet. The set of replicas for a given
tablet is called a Paxos group. For each tablet and its replicas, a long-lived Paxos leader
is designated with a time-based leader lease. The Paxos state machines are used
to keep the replicas in a consistent state. Every write must therefore initiate
the Paxos protocol at the Paxos leader, while reads can be served from the Paxos
state at any replica that is sufficiently up to date. At the leader replica, a
lock table is used to implement concurrency control based on a two-phase locking
(2PL) protocol. Consequently, all operations that require synchronization must
acquire locks in the lock table.
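
The division of labor in this paragraph can be sketched in Python as follows.
This is a deliberately simplified model, not an implementation of Paxos: a
plain majority check stands in for the consensus protocol, and every name
(Replica, PaxosGroup, acquire, commit, read, the reachable parameter) is
hypothetical. It shows writes going through the leader, which takes locks in its
lock table before replicating, and reads being accepted at any replica whose
applied timestamp is recent enough.

```python
class LockConflict(Exception):
    pass


class NoMajority(Exception):
    pass


class StaleReplica(Exception):
    pass


class Replica:
    def __init__(self, name):
        self.name = name
        self.data = {}
        self.applied_ts = 0  # timestamp of the last write applied here

    def apply(self, writes, ts):
        self.data.update(writes)
        self.applied_ts = ts


class PaxosGroup:
    """One leader plus followers; the lock table lives at the leader only."""

    def __init__(self, leader, followers):
        self.leader = leader
        self.followers = followers
        self.lock_table = {}  # key -> id of the transaction holding the lock

    def acquire(self, txn, key):
        owner = self.lock_table.setdefault(key, txn)
        if owner != txn:
            raise LockConflict(f"{key!r} is locked by {owner}")

    def release_all(self, txn):
        for key in [k for k, o in self.lock_table.items() if o == txn]:
            del self.lock_table[key]

    def commit(self, txn, writes, ts, reachable):
        try:
            # Growing phase of 2PL: take every lock before writing anything.
            for key in writes:
                self.acquire(txn, key)
            # Stand-in for Paxos: the leader needs acks from a majority.
            acks = [f for f in self.followers if f.name in reachable]
            if 2 * (len(acks) + 1) <= len(self.followers) + 1:
                raise NoMajority("write not accepted by a majority")
            for replica in [self.leader, *acks]:
                replica.apply(writes, ts)
        finally:
            # Shrinking phase: release this transaction's locks on commit or abort.
            self.release_all(txn)

    def read(self, replica, key, min_ts):
        # Any replica may serve the read if it has caught up to min_ts.
        if replica.applied_ts < min_ts:
            raise StaleReplica(f"{replica.name} is behind timestamp {min_ts}")
        return replica.data.get(key)
```

In this model a follower that missed a write rejects reads that require the
newer timestamp, while the leader and the followers that acknowledged the write
can serve them.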
To manage the global ordering and external consistency, Spanner relies on a time
API called TrueTime. This API exposes clock uncertainty and allows Spanner to
assign globally meaningful commit timestamps. TrueTime keeps the clock uncertainty
small by relying on atomic clocks and GPS-based clocks in every
datacenter. Moreover, when the uncertainty grows large, Spanner
slows down to wait out that uncertainty. The TrueTime API is thus used to guarantee
Spanner's desired correctness properties for concurrent executions, providing
external consistency (linearizability) while enabling lock-free read-only
transactions and nonblocking reads in the past.
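
The idea of exposing uncertainty and waiting it out can be sketched in Python.
This is a minimal model under stated assumptions, not the real TrueTime API: the
class TrueTimeSketch, the fixed uncertainty bound epsilon, the injectable clock,
and the function commit_wait are all hypothetical. The sketch returns time as an
interval that is assumed to contain the true time, picks the upper end of that
interval as the commit timestamp, and then waits until the timestamp is
guaranteed to be in the past.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: float
    latest: float


class TrueTimeSketch:
    """Toy clock that reports an interval assumed to contain the true time."""

    def __init__(self, epsilon, clock=time.monotonic):
        self.epsilon = epsilon  # assumed fixed uncertainty bound
        self._clock = clock     # injectable so the sketch can be tested

    def now(self):
        t = self._clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, t):
        # True only once t has definitely passed on every possible clock.
        return self.now().earliest > t


def commit_wait(tt, sleep=time.sleep, poll=0.001):
    # Pick a commit timestamp no earlier than the true time at this moment.
    s = tt.now().latest
    # Wait out the uncertainty: do not report the commit until s has passed.
    while not tt.after(s):
        sleep(poll)
    return s
```

In this model the wait lasts roughly twice epsilon, which illustrates why a
larger clock uncertainty slows commits down and why keeping it small matters.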