The Apache Curator project also provides an extensive set of ZooKeeper recipes, as well
as a simplified ZooKeeper client.
BookKeeper and Hedwig
BookKeeper is a highly available and reliable logging service. It can be used to provide
write-ahead logging, which is a common technique for ensuring data integrity in storage
systems. In a system using write-ahead logging, every write operation is written to the
transaction log before it is applied. Using this procedure, we don't have to write the data
to permanent storage after every write operation, because in the event of a system failure,
the latest state may be recovered by replaying the transaction log for any writes that were
not applied.
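The log-then-apply procedure can be illustrated with a minimal in-memory sketch. It is not BookKeeper code: the WriteAheadStore class and its methods are hypothetical names chosen for illustration, and a Python list stands in for a durable transaction log.

```python
class WriteAheadStore:
    """Toy key-value store that logs every write before applying it."""

    def __init__(self, log=None):
        # The log stands in for durable storage (an assumption of this sketch);
        # a real system would use a file or a replicated logging service.
        self.log = log if log is not None else []
        self.state = {}
        self.applied = 0  # number of log records already applied to state

    def write(self, key, value):
        # Step 1: record the write in the transaction log.
        self.log.append((key, value))
        # Step 2: apply the write to the in-memory state.
        self._apply_next()

    def _apply_next(self):
        key, value = self.log[self.applied]
        self.state[key] = value
        self.applied += 1

    def recover(self):
        # Replay every logged write that has not been applied yet.
        while self.applied < len(self.log):
            self._apply_next()


store = WriteAheadStore()
store.write("a", 1)
store.write("b", 2)

# Simulate a failure: one write reached the log but was never applied,
# and the in-memory state is lost. Only the log survives.
store.log.append(("a", 3))
surviving_log = list(store.log)

restarted = WriteAheadStore(log=surviving_log)
restarted.recover()
```

After recovery, the restarted store holds the latest state, including the write that was logged but never applied before the failure.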
BookKeeper clients create logs called ledgers, and each record appended to a ledger is
called a ledger entry, which is simply a byte array. Ledgers are managed by bookies,
which are servers that replicate the ledger data. Note that ledger data is not stored in
ZooKeeper; only metadata is.
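The relationship between ledgers, entries, bookies, and metadata can be pictured with a toy model. This is a sketch under simplifying assumptions, not the BookKeeper client API: the Bookie and Ledger classes are hypothetical, a plain dictionary stands in for ZooKeeper, and every entry is copied to every bookie, which is a simplification of real replication.

```python
class Bookie:
    """Hypothetical stand-in for a server that stores ledger entries."""

    def __init__(self, name):
        self.name = name
        self.entries = {}  # (ledger_id, entry_id) -> bytes

    def store(self, ledger_id, entry_id, data):
        self.entries[(ledger_id, entry_id)] = data


class Ledger:
    """Append-only log whose entries are byte arrays copied to bookies."""

    def __init__(self, ledger_id, bookies, metadata_store):
        self.ledger_id = ledger_id
        self.bookies = bookies
        self.next_entry_id = 0
        # Only metadata (here, which bookies hold the ledger) goes to the
        # coordination service; entry payloads never do.
        metadata_store[ledger_id] = [b.name for b in bookies]

    def add_entry(self, data: bytes) -> int:
        entry_id = self.next_entry_id
        for bookie in self.bookies:
            bookie.store(self.ledger_id, entry_id, data)
        self.next_entry_id += 1
        return entry_id


metadata = {}  # stands in for ZooKeeper in this sketch
bookies = [Bookie("bookie-1"), Bookie("bookie-2"), Bookie("bookie-3")]
ledger = Ledger(1, bookies, metadata)
first = ledger.add_entry(b"create /a")
second = ledger.add_entry(b"delete /a")
```

Each entry receives a sequential identifier within its ledger, the payload lives only on the bookies, and the metadata dictionary records nothing but the ledger's placement.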
Traditionally, the challenge has been to make systems that use write-ahead logging robust
in the face of failure of the node writing the transaction log. This is usually done by
replicating the transaction log in some manner. HDFS high availability, described elsewhere,
uses a group of journal nodes to provide a highly available edit log. Although it is similar
to BookKeeper, it is a dedicated service written for HDFS, and it doesn't use ZooKeeper as
the coordination engine.
Hedwig is a topic-based publish-subscribe system built on BookKeeper. Thanks to its
ZooKeeper underpinnings, Hedwig is a highly available service and guarantees message
delivery even if subscribers are offline for extended periods of time.
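To make the delivery guarantee concrete, here is a toy topic-based publish-subscribe hub that retains messages and tracks a cursor per subscriber, so a subscriber that was offline still receives what it missed. It is an illustrative sketch only: TopicHub and its methods are hypothetical names, not the Hedwig API, and messages are kept in memory rather than in BookKeeper.

```python
from collections import defaultdict


class TopicHub:
    """Toy topic-based publish-subscribe hub with per-subscriber cursors."""

    def __init__(self):
        self.messages = defaultdict(list)  # topic -> retained messages
        self.cursors = defaultdict(dict)   # topic -> {subscriber: next index}

    def subscribe(self, topic, subscriber):
        # A new subscriber starts at the current end of the topic.
        self.cursors[topic].setdefault(subscriber, len(self.messages[topic]))

    def publish(self, topic, message):
        # Messages are retained whether or not anyone is currently listening.
        self.messages[topic].append(message)

    def poll(self, topic, subscriber):
        # Deliver everything published since this subscriber last polled.
        start = self.cursors[topic][subscriber]
        pending = self.messages[topic][start:]
        self.cursors[topic][subscriber] = len(self.messages[topic])
        return pending


hub = TopicHub()
hub.subscribe("alerts", "sub-1")
hub.publish("alerts", b"disk full")       # sub-1 is offline at this point
hub.publish("alerts", b"disk replaced")
delivered = hub.poll("alerts", "sub-1")   # sub-1 reconnects and catches up
```

Because the hub stores messages independently of subscriber connections, being offline only delays delivery; a second poll with no new publications returns an empty list.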
BookKeeper is a ZooKeeper subproject, and you can find more information on how to use
it, as well as Hedwig, at http://zookeeper.apache.org/bookkeeper/.