digest from the remaining nodes is obtained (to satisfy the consistency level). If there is a mismatch, such as the tombstone not yet being propagated to all the replicas, a partial read repair is triggered, where the final view of the data is sent to all the nodes that were involved in this read, to satisfy the consistency level.
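As a rough illustration of when this digest comparison and read repair happens, here is a minimal sketch using the Python cassandra-driver; the keyspace, table, and column names are hypothetical. A read at QUORUM forces the coordinator to consult more than one replica:

from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('my_ks')

# At QUORUM the coordinator reads full data from one replica and digests
# from the others; a mismatch (for example, a tombstone that has not yet
# reached every replica) triggers a read repair on the replicas involved
# in this read.
query = SimpleStatement(
    "SELECT * FROM users WHERE user_id = %s",
    consistency_level=ConsistencyLevel.QUORUM,
)
print(session.execute(query, ('alice',)).one())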
One thing in which delete differs from update is compaction. A compaction removes a tombstone only if the tombstone's garbage collection grace period has passed. This period is called gc_grace_seconds (configurable). So, do not expect that a major deletion will free up a lot of space immediately.
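For reference, gc_grace_seconds is set per table. A minimal sketch of changing it with the Python cassandra-driver (the keyspace and table names are hypothetical):

from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])
session = cluster.connect()

# Lower the tombstone grace period from the default (864,000 seconds,
# that is, 10 days) to 1 day; tombstones older than this become eligible
# for removal at the next compaction, so repairs must run more often
# than this.
session.execute("ALTER TABLE my_ks.users WITH gc_grace_seconds = 86400")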
What happens to a node that was down while data it holds was deleted on the other live replicas? If a tombstone still exists on any of the replica nodes, the delete information will eventually reach the previously dead node. But a compaction that occurs gc_grace_seconds after the deletion will kick the old tombstones out. This is a problem, because no information about the deleted column is left. Now, if a node that was dead for the whole of gc_grace_seconds wakes up and sees that it has some data that no other node has, it will treat this data as fresh data and, assuming the other replicas missed a write, it will replicate the data to all the other replica nodes. The old data will resurrect, replicate, and may reappear in client results.
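A tiny sketch of the timeline behind this failure mode (the 10-day figure is Cassandra's default, discussed next):

GC_GRACE_SECONDS = 10 * 24 * 3600  # default tombstone grace period

def resurrection_possible(downtime_seconds):
    # Once a replica has been down for longer than gc_grace_seconds, the
    # tombstones it missed may already have been compacted away elsewhere,
    # so the stale data it still holds can no longer be overridden.
    return downtime_seconds > GC_GRACE_SECONDS

print(resurrection_possible(3 * 24 * 3600))   # False: a tombstone still exists
print(resurrection_possible(12 * 24 * 3600))  # True: deleted data can come back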
gc_grace_seconds is 10 days by default, which should be long enough for any sane system administrator to bring the node back in or discard it completely. But it is something to watch out for, and a reason to repair nodes regularly.
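The practical rule that follows is to run an anti-entropy repair on every node at least once within each gc_grace_seconds window. A minimal sketch of scripting that with nodetool (the host addresses are hypothetical):

import subprocess

# nodetool is Cassandra's standard admin tool; "repair" runs an
# anti-entropy repair so that deletions (tombstones) reach every replica
# before compaction is allowed to drop them.
for host in ('10.0.0.1', '10.0.0.2', '10.0.0.3'):
    subprocess.run(['nodetool', '-h', host, 'repair'], check=True)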
Hinted handoff
When we last talked about durability, we observed that Cassandra provides a commit log for write durability. This is good. But what if the node where the writes are headed is itself dead? No amount of communication will get anything new written to that node. Cassandra, inspired by Dynamo, has a feature called "hinted handoff". In short, it is the same as taking a quick note locally: X cannot be contacted; here is the mutation, M, that will need to be replayed when X comes back.
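As a sketch of what that quick note amounts to in practice (Python cassandra-driver again; the table is hypothetical, and assume one of the three replicas for this row is down):

from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('my_ks')

# With a replication factor of 3 and QUORUM, two live replicas are enough
# for the write to succeed. The coordinator keeps a hint (the mutation plus
# the identity of the dead replica) and replays it once gossip reports
# that node as alive again.
insert = SimpleStatement(
    "INSERT INTO users (user_id, email) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.QUORUM,
)
session.execute(insert, ('alice', 'alice@example.com'))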
The coordinator node (the node that the client is connected to), on receipt of a mutation/write request, forwards it to the appropriate replicas that are alive. If this fulfills the expected consistency level, the write is assumed successful. A write request for a node that does not respond, or that is known to be dead (via gossip), is stored locally in the system.hints table. This hint contains the mutation. When a node comes to know, via