NOTE
There's actually another failure mode that we've ignored here. When the ZooKeeper object is created,
it tries to connect to a ZooKeeper server. If the connection fails or times out, it tries another server
in the ensemble. If, after trying all of the servers in the ensemble, it can't connect, it throws an
IOException. The likelihood of all ZooKeeper servers being unavailable is low; nevertheless, some
applications may choose to retry the operation in a loop until ZooKeeper is available.
This is just one strategy for retry handling. There are many others, such as using exponential
backoff, where the period between retries is multiplied by a constant each time.
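A retry loop with exponential backoff can be sketched as follows. This is an illustrative utility, not part of the ZooKeeper API: the class name RetryBackoff, the method names, and the delay constants (1-second base, 30-second cap, doubling each attempt) are all assumptions for the example.

```java
import java.util.concurrent.Callable;

// Sketch of retrying an operation with exponential backoff.
// RetryBackoff and its constants are hypothetical, not ZooKeeper APIs.
public class RetryBackoff {
    static final long BASE_DELAY_MS = 1000;   // first retry waits 1 second
    static final long MAX_DELAY_MS = 30_000;  // cap so waits stay bounded

    // Delay before retry number 'attempt' (0-based): base * 2^attempt, capped.
    public static long backoffMillis(int attempt) {
        long delay = BASE_DELAY_MS;
        for (int i = 0; i < attempt && delay < MAX_DELAY_MS; i++) {
            delay *= 2;
        }
        return Math.min(delay, MAX_DELAY_MS);
    }

    // Keep calling 'op' (e.g., a ZooKeeper connection attempt) until it
    // succeeds or the retry budget is exhausted, sleeping for an
    // exponentially growing period between attempts.
    public static <T> T retryWithBackoff(Callable<T> op, int maxRetries)
            throws Exception {
        for (int attempt = 0; ; attempt++) {
            try {
                return op.call();
            } catch (java.io.IOException e) {
                if (attempt >= maxRetries) {
                    throw e; // give up: still unreachable
                }
                Thread.sleep(backoffMillis(attempt));
            }
        }
    }
}
```

The cap on the delay matters in practice: without it, a long outage would produce waits that grow without bound.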
A Lock Service
A distributed lock is a mechanism for providing mutual exclusion between a collection of
processes. At any one time, only a single process may hold the lock. Distributed locks can
be used for leader election in a large distributed system, where the leader is the process
that holds the lock at any point in time.
NOTE
Do not confuse ZooKeeper's own leader election with a general leader election service, which can be
built using ZooKeeper primitives (and in fact, one implementation is included with ZooKeeper).
ZooKeeper's own leader election is not exposed publicly, unlike the type of general leader election
service we are describing here, which is designed to be used by distributed systems that need to
agree upon a master process.
To implement a distributed lock using ZooKeeper, we use sequential znodes to impose an
order on the processes vying for the lock. The idea is simple: first, designate a lock znode,
typically describing the entity being locked on (say, /leader); then, clients that want to
acquire the lock create sequential ephemeral znodes as children of the lock znode. At any
point in time, the client with the lowest sequence number holds the lock. For example, if
two clients create the znodes at /leader/lock-1 and /leader/lock-2 around the same time,
then the client that created /leader/lock-1 holds the lock, since its znode has the lowest se-
quence number. The ZooKeeper service is the arbiter of order because it assigns the se-
quence numbers.
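To make the ordering concrete: ZooKeeper appends a ten-digit, zero-padded sequence number to the name of each sequential znode, so the children of /leader would actually look like lock-0000000001 and lock-0000000002, and lexicographic order on the names agrees with numeric order on the sequence numbers. The holder-selection logic can then be sketched as below; LockOrder and lockHolder are hypothetical names, and real code would obtain the child list from ZooKeeper's getChildren call.

```java
import java.util.List;

// Hypothetical helper (LockOrder/lockHolder are illustrative names): given
// the children of the lock znode, return the name of the current lock
// holder -- the child with the lowest sequence number. Because ZooKeeper
// zero-pads sequence numbers, lexicographic comparison of the names
// matches numeric comparison of the sequence numbers.
public class LockOrder {
    public static String lockHolder(List<String> children) {
        String holder = null;
        for (String child : children) {
            if (holder == null || child.compareTo(holder) < 0) {
                holder = child;
            }
        }
        return holder;
    }
}
```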
The lock may be released simply by deleting the znode /leader/lock-1; alternatively, if the
client process dies, its znode will be deleted automatically by virtue of being ephemeral. The
client that created /leader/lock-2 will then hold the lock, because it now has the lowest
sequence number. It ensures it is notified when it acquires the lock by setting a watch that
fires when znodes are deleted.
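One way to decide what to watch (a sketch; a common refinement, to avoid waking every waiting client when a znode is deleted, is for each client to watch only the znode with the next-lowest sequence number rather than the whole child list): given the children of the lock znode and a client's own znode name, determine whether it holds the lock and, if not, which predecessor to watch. LockWatch and znodeToWatch are hypothetical names for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch (LockWatch/znodeToWatch are illustrative names):
// decide which znode a waiting client should watch. Returns null if the
// client's own znode has the lowest sequence number (it holds the lock);
// otherwise returns the predecessor znode, whose deletion signals that
// this client may now hold the lock.
public class LockWatch {
    public static String znodeToWatch(List<String> children, String own) {
        List<String> sorted = new ArrayList<>(children);
        Collections.sort(sorted); // zero-padded names sort in sequence order
        int index = sorted.indexOf(own);
        return index <= 0 ? null : sorted.get(index - 1);
    }
}
```

In real code the watch itself would be set with a ZooKeeper exists call on the chosen predecessor, and the client would re-check the child list when the watch fires, since the predecessor may have died without ever holding the lock.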