When a node learns, via gossip, that a previously down node has recovered, it replays all the hints it has in store for that node. Also, every 10 minutes, it checks for any pending hinted handoffs that still need to be written.
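To make the mechanics concrete, here is a minimal, purely conceptual sketch in Python (not Cassandra's actual code; the Node class and hint_store structure are made up for illustration): hints are queued per down node and delivered once gossip reports the target alive again.

from collections import defaultdict

class Node:
    """Toy stand-in for a cluster node; gossip would set `alive` in reality."""
    def __init__(self, name):
        self.name = name
        self.alive = False
        self.data = {}

    def apply(self, mutation):
        key, value = mutation
        self.data[key] = value

# Map each unreachable node to the writes (mutations) it has missed.
hint_store = defaultdict(list)

def replay_pending_hints():
    """Deliver stored hints to every target node now reported alive.
    Cassandra runs a check like this periodically (every 10 minutes)."""
    for node, hints in hint_store.items():
        if node.alive and hints:
            for mutation in hints:
                node.apply(mutation)
            hints.clear()

# Node C missed a write while it was down; once gossip marks it alive,
# the next check replays the hint and C catches up.
c = Node("C")
hint_store[c].append(("X", "X1"))
c.alive = True
replay_pending_hints()
print(c.data)  # {'X': 'X1'}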
Why worry about hinted handoff when the write has already satisfied the consistency level? Wouldn't the data eventually get repaired? Yes, that's right. Also, hinted handoff may not be the most reliable way to repair a missed write: what if the node holding the hinted handoff dies? This is why we do not count on hinted handoff as a mechanism to provide a consistency guarantee (except in the case of the consistency level ANY); it is a single point of failure. The purposes of hinted handoff are twofold: one, to make restored nodes quickly consistent with the other live ones; and two, to provide extreme write availability when consistency is not required.
Extreme write availability is obtained at the cost of consistency. One can set the consistency level for writes to ANY. If all the replicas that are meant to hold this value are down, Cassandra will just write a local hinted handoff and return write success to the client. There is one caveat: the handoff can be on any node. So, data that we have written only as a hint will not be available to reads until the dead replicas come back up and the hinted handoff is replayed. But it is a nice feature.
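For example, using the DataStax Python driver, a write at consistency level ANY might look like the following sketch (the keyspace demo and table users are hypothetical placeholders):

from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

# Keyspace `demo` and table `users` are hypothetical placeholders.
cluster = Cluster(['127.0.0.1'])
session = cluster.connect('demo')

# At consistency level ANY, this write succeeds even if every replica for
# the partition is down: the coordinator stores a hint and reports success.
insert = SimpleStatement(
    "INSERT INTO users (id, name) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.ANY)
session.execute(insert, (42, 'alice'))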
Note
There is a slight difference in where the hinted handoff is stored across Cassandra versions. Prior to Cassandra 1.0, a hinted handoff is stored on one of the replica nodes that can be communicated with. From version 1.0 onward, the handoff can be written on the coordinator node (the node that the client is connected to). Removing a node from a cluster causes the deletion of any hinted handoffs stored for that node. All hints for deleted records are dropped as well.
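In the versions under discussion, pending hints can be inspected by querying the node holding them; the sketch below assumes the pre-3.0 layout, in which hints live in the system.hints table keyed by the target node's ID (later versions moved hints to flat files on disk):

from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])
session = cluster.connect()

# Each row is keyed by the UUID of the node the hint is destined for
# (assuming the pre-3.0 system.hints schema).
for row in session.execute("SELECT target_id FROM system.hints"):
    print("pending hint for node", row.target_id)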
Read repair and anti-entropy
Cassandra promises eventual consistency, and read repair is the process that delivers on that promise at read time. Read repair, as the name suggests, is the process of fixing inconsistencies among the replicas at the time of a read. What does that mean? Let's say we have three replica nodes, A, B, and C, that each contain a piece of data, X. During an update, X is updated to X1 on replicas A and B, but the update fails on replica C for some reason. On a read request for data X, the coordinator node asks for a full read from the nearest node (based on the configured snitch) and digests of data X from as many other nodes as are needed to satisfy the consistency level. The coordinator node compares these values (something like digest(full_X) == digest_from_node_C ). If the digests match the digest of the full read, the system is consistent, and the full data is returned to the client.
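The following toy Python sketch (the replica map and timestamp handling are made up for illustration; Cassandra's real implementation differs) shows the digest comparison and the repair write-back that follows a mismatch:

import hashlib

# Toy replicas holding (value, timestamp) pairs: A and B saw the update
# to X1, but the update failed on C, which still holds the older X.
replicas = {"A": (b"X1", 2), "B": (b"X1", 2), "C": (b"X", 1)}

def digest(value):
    # Ship a digest instead of the full value (MD5 here for illustration).
    return hashlib.md5(value).digest()

def coordinator_read(full_from, digest_from):
    full_value, _ = replicas[full_from]          # full read from nearest replica
    stale = [n for n in digest_from
             if digest(replicas[n][0]) != digest(full_value)]
    if not stale:
        return full_value                        # digests agree: consistent
    # Digests differ: fetch full data from the involved replicas, keep the
    # value with the latest timestamp, and write it back to stale replicas.
    winner = max((replicas[n] for n in [full_from] + digest_from),
                 key=lambda vt: vt[1])
    for n in stale:
        replicas[n] = winner                     # the repair write-back
    return winner[0]

print(coordinator_read("A", ["B", "C"]))  # repairs C and prints b'X1'
print(replicas["C"])                      # now (b'X1', 2)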