Cassandra Architecture - Mastering Apache Cassandra


Note

Major compaction may not be the best idea in Cassandra v0.8 and later, for a couple of reasons. One is that automated minor compaction no longer runs after a major compaction is executed, so regular major compaction then requires manual intervention or extra work (such as setting up a cron job). Another is that the performance gain from a major compaction may deteriorate with time, probably because the larger the SSTable (which is what a major compaction produces), the more likely it is to yield Bloom filter false positives, and the longer a binary search takes on its very large index.

Tombstones

Cassandra is a complex system with its data distributed among commit logs, MemTables, and SSTables on a node. The same data is then replicated over replica nodes. So, like everything else in Cassandra, deletion is going to be eventful. Deletion, to an extent, follows an update pattern, except that Cassandra writes a special marker, called a tombstone, in place of the deleted data. This marker helps future queries, compaction, and conflict resolution. Let's step further down and see what happens when a column is deleted from a column family.

A client connected to a node (the coordinator, which may not be the node holding the data that we are going to mutate) issues a delete command for a column C in a column family CF. If the consistency level is satisfied, the delete command gets processed. When a node containing the row key receives a delete request, it updates or inserts the column in the MemTable with a special value, namely a tombstone. The tombstone has the same column name as the column it replaces; its value is set to the deletion time, expressed in seconds since the Unix epoch, and its timestamp is set to what the client has passed. When a MemTable is flushed to an SSTable, all tombstones go into it just as any regular column would.
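The write path described above can be sketched in a few lines of Python. This is only a toy model, not Cassandra's code: the `Column` and `MemTable` classes, their method names, and the 4-byte big-endian encoding of the marker value are assumptions made for illustration. It shows a delete being applied as an ordinary write that carries the same column name, a marker value, and the client's timestamp, and then being flushed along with regular columns.

```python
import struct
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    value: bytes
    timestamp: int              # timestamp passed by the client
    is_tombstone: bool = False


class MemTable:
    """Toy in-memory table (hypothetical): row key -> column name -> Column."""

    def __init__(self):
        self.rows = {}

    def _apply(self, row_key, column):
        columns = self.rows.setdefault(row_key, {})
        current = columns.get(column.name)
        # Keep the version with the higher timestamp. On a tie this sketch
        # simply lets the incoming write win (a simplification).
        if current is None or column.timestamp >= current.timestamp:
            columns[column.name] = column

    def insert(self, row_key, name, value, timestamp):
        self._apply(row_key, Column(name, value, timestamp))

    def delete(self, row_key, name, timestamp, now=None):
        # A delete is just another write: same column name, marker value.
        # Assumption: the marker holds the deletion time in epoch seconds.
        deletion_time = int(time.time() if now is None else now)
        marker = struct.pack(">i", deletion_time)
        self._apply(row_key, Column(name, marker, timestamp, is_tombstone=True))

    def flush(self):
        """Return a sorted, immutable snapshot standing in for an SSTable."""
        sstable = tuple(
            (row_key, tuple(sorted(cols.values(), key=lambda c: c.name)))
            for row_key, cols in sorted(self.rows.items())
        )
        self.rows = {}
        return sstable


memtable = MemTable()
memtable.insert("row1", "C", b"hello", timestamp=100)
memtable.delete("row1", "C", timestamp=200)
snapshot = memtable.flush()   # the tombstone is flushed like any other column
```

Note that nothing is physically removed here: the tombstone overwrites the live column in memory only because its timestamp is higher, and it travels into the flushed snapshot unchanged.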

On the read side, when data is read locally on a node and multiple versions of it exist in different SSTables, the versions are compared and the latest one is taken as the result of reconciliation. If a tombstone turns out to be the result of reconciliation, it is made a part of the result that this node returns. So, at this level, a deleted column still appears in the query result. The tombstones are eventually filtered out of the result before it is returned to the client, so a client never sees a tombstone as a value.
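The two stages of the read path, local reconciliation and the final filtering of tombstones, can be illustrated with a small Python sketch. The names `Cell`, `reconcile`, `read_local`, and `read_for_client` are hypothetical, each SSTable is modeled as a plain nested dictionary, and the rule that a tombstone wins when timestamps are equal is an assumption of this sketch rather than something stated in the text.

```python
from typing import NamedTuple, Optional


class Cell(NamedTuple):
    value: Optional[bytes]
    timestamp: int
    is_tombstone: bool = False


def reconcile(versions):
    """Pick the version with the highest timestamp.

    Assumption: on equal timestamps the tombstone is preferred.
    """
    return max(versions, key=lambda cell: (cell.timestamp, cell.is_tombstone))


def read_local(sstables, row_key, column_name):
    """Node-level read: the winner may itself be a tombstone."""
    versions = [
        sstable[row_key][column_name]
        for sstable in sstables
        if row_key in sstable and column_name in sstable[row_key]
    ]
    return reconcile(versions) if versions else None


def read_for_client(sstables, row_key, column_name):
    """Client-facing read: tombstones are filtered out before returning."""
    winner = read_local(sstables, row_key, column_name)
    if winner is None or winner.is_tombstone:
        return None
    return winner.value


# Two SSTables holding different versions of row1/C.
older = {"row1": {"C": Cell(b"v1", 100)}}
newer = {
    "row1": {"C": Cell(None, 200, is_tombstone=True)},
    "row2": {"C": Cell(b"v2", 50)},
}

node_result = read_local([older, newer], "row1", "C")         # a tombstone
client_result = read_for_client([older, newer], "row1", "C")  # None
```

Keeping the tombstone in the node-level result is what lets it take part in later conflict resolution; only the last step hides it from the client.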

For consistency levels greater than one, the query is executed on as many replicas as the

consistency level. The same as a regular read process, data from the closest node and a di-
