DOES ZOOKEEPER USE PAXOS?
No. ZooKeeper's Zab protocol is not the same as the well-known Paxos algorithm. [145] Zab is similar,
but it differs in several aspects of its operation, such as relying on TCP for its message ordering
guarantees. [146]
Google's Chubby Lock Service, [147] which shares similar goals with ZooKeeper, is based on Paxos.
If the leader fails, the remaining machines hold another leader election and continue as
before with the new leader. If the old leader later recovers, it then starts as a follower.
Leader election is very fast, around 200 ms according to one published result, so performance
does not noticeably degrade during an election.
All machines in the ensemble write updates to disk before updating their in-memory
copies of the znode tree. Read requests may be serviced from any machine, and because they
involve only a lookup from memory, they are very fast.
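The ordering described above (disk first, memory second, reads from memory only) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not ZooKeeper's actual code: the class ToyServer, its tab-separated log format, and the replay helper are all hypothetical names invented for illustration.

```python
import os


class ToyServer:
    """Hypothetical stand-in for one ensemble member; not ZooKeeper code."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.tree = {}  # in-memory copy of the znode tree: path -> data

    def apply_update(self, path, data):
        # Step 1: append the update to the log and force it to disk.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(f"{path}\t{data}\n")
            log.flush()
            os.fsync(log.fileno())
        # Step 2: only now make the update visible in memory.
        self.tree[path] = data

    def read(self, path):
        # Reads are a dictionary lookup and never touch the disk.
        return self.tree.get(path)

    @classmethod
    def replay(cls, log_path):
        # Assumption for illustration: a restarted server rebuilds its
        # in-memory tree by replaying the log it wrote earlier.
        server = cls(log_path)
        with open(log_path, encoding="utf-8") as log:
            for line in log:
                path, data = line.rstrip("\n").split("\t", 1)
                server.tree[path] = data
        return server
```

Because the log entry is on disk before the in-memory tree changes, any update a reader could have seen in this model can be rebuilt from the log after a restart. The toy log format does not handle data that contains tabs or newlines.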
Consistency
Understanding the basis of ZooKeeper's implementation helps in understanding the
consistency guarantees that the service makes. The terms “leader” and “follower” for the
machines in an ensemble are apt because they make the point that a follower may lag the
leader by a number of updates. This is a consequence of the fact that only a majority and
not all members of the ensemble need to have persisted a change before it is committed. A
good mental model for ZooKeeper is of clients connected to ZooKeeper servers that are
following the leader. A client may actually be connected to the leader, but it has no control
over this and cannot even know if this is the case. [148] See Figure 21-2.
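The majority rule and the resulting follower lag can be made concrete with a small Python sketch. It is a deliberately simplified model, not the Zab protocol: ToyEnsemble, quorum_size, and the persisted_by parameter are hypothetical names, and each server is reduced to a dictionary.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


class ToyEnsemble:
    """Hypothetical model: each server is just a dict of path -> data."""

    def __init__(self, size):
        self.servers = [{} for _ in range(size)]

    def write(self, path, data, persisted_by):
        """Apply an update on the servers whose indices are in persisted_by.

        Returns True if enough servers persisted it to count as committed.
        """
        if len(persisted_by) < quorum_size(len(self.servers)):
            return False  # no majority, so the toy model applies nothing
        for i in persisted_by:
            self.servers[i][path] = data
        return True

    def read(self, server_index, path):
        return self.servers[server_index].get(path)


ensemble = ToyEnsemble(5)
committed = ensemble.write("/config", "v2", persisted_by={0, 1, 2})
print(committed)                    # True: 3 of 5 is a majority
print(ensemble.read(0, "/config"))  # v2
print(ensemble.read(4, "/config"))  # None: server 4 has not caught up
```

In this model the update is committed as soon as three of the five servers have it, so a client that happens to be connected to server 4 still sees the older state until that server catches up.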