Cassandra version 1.1.1 uses SnapTree (https://github.com/nbronson/snaptree) for MemTable representation, which claims to be "A drop-in replacement for ConcurrentSkipListMap, with the additional guarantee that clone() is atomic and iteration has snapshot isolation". See also copy-on-write and compare-and-swap on the following sites:
Note
SnapTree is very likely to be replaced by a B-tree implementation. The B-tree is implemented in the Cassandra 2.1 beta version, so it is likely to become the default in the future. For more information, visit
Every write is first written to the commit log and then to the MemTable.
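The ordering described above (commit log first, in-memory table second) can be illustrated with a minimal sketch. This is not Cassandra's code: the class name TinyStore, the JSON-lines log format, and the plain dict standing in for the MemTable are all assumptions made for illustration.

```python
import json
import os
import tempfile


class TinyStore:
    """Toy write path: append to a commit log, then update an in-memory table."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.memtable = {}  # hypothetical stand-in for the real MemTable structure

    def write(self, key, value):
        # Step 1: record the mutation in the append-only commit log and force it to disk.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        # Step 2: only then apply the mutation to the in-memory table.
        self.memtable[key] = value

    def replay(self):
        """Rebuild the in-memory state from the commit log, as after a restart."""
        recovered = {}
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                record = json.loads(line)
                recovered[record["key"]] = record["value"]  # later records win
        return recovered


with tempfile.TemporaryDirectory() as tmp:
    store = TinyStore(os.path.join(tmp, "commit.log"))
    store.write("user:1", "alice")
    store.write("user:1", "alicia")
    recovered = store.replay()
```

Because the log is written before the in-memory table, replaying the log reproduces everything the in-memory table held, which is the point of the ordering.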
SSTable
An SSTable is the on-disk representation of the data. MemTables get flushed to disk as immutable SSTables. Each MemTable is flushed to its own SSTable, and all the writes are sequential, which makes this process fast. So, the faster the disk, the quicker the flush operation.
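As a rough sketch of the flush described above, the following writes an in-memory table to a new file in key order in a single sequential pass and never touches the file again. The tab-separated text layout and the function names flush_memtable and read_sstable are assumptions for illustration only; they are not Cassandra's on-disk format or API.

```python
import os
import tempfile


def flush_memtable(memtable, path):
    """Write every entry in key order, front to back, into a brand-new file."""
    with open(path, "w", encoding="utf-8") as out:
        for key in sorted(memtable):
            # Sequential append only; nothing is rewritten in place.
            out.write(f"{key}\t{memtable[key]}\n")


def read_sstable(path):
    """Read the flushed file back as a list of (key, value) pairs."""
    with open(path, encoding="utf-8") as f:
        return [tuple(line.rstrip("\n").split("\t", 1)) for line in f]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data-1.txt")
    flush_memtable({"k3": "c", "k1": "a", "k2": "b"}, path)
    rows = read_sstable(path)
```

The sketch assumes keys and values contain no tabs or newlines; a real format would need length prefixes or escaping.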
The SSTables eventually get merged in the compaction process, and the data gets organized properly into one file. This extra work in compaction pays off during reads.
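The merge step can be pictured as a k-way merge of already-sorted runs. The sketch below is a simplification under stated assumptions: each SSTable is modelled as a sorted list of (key, value) pairs with unique keys, the input list is ordered oldest to newest, and the newest value for a key wins. Deletions and timestamps are ignored, and the function name compact is hypothetical.

```python
import heapq


def compact(sstables):
    """Merge sorted (key, value) runs into one run; sstables is ordered oldest to newest."""
    # Tag each entry with the negated age rank of its table so that,
    # for equal keys, the entry from the newest table sorts first.
    tagged = (
        [(key, -age, value) for key, value in table]
        for age, table in enumerate(sstables)
    )
    merged = []
    for key, _, value in heapq.merge(*tagged):
        if not merged or merged[-1][0] != key:
            merged.append((key, value))  # first entry seen per key is the newest
    return merged


old = [("a", "1"), ("c", "3")]
new = [("a", "9"), ("b", "2")]
result = compact([old, new])
```

After the merge, a read for any key consults one sorted run instead of several, which is where the extra work pays off.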
SSTables have three components: bloom filter, index files, and data files.
The bloom filter
The bloom filter is a litmus test for the availability of certain data in storage (collection).
But unlike a litmus test, a bloom filter may result in false positives; that is, it says that
data exists in the collection associated with the bloom filter, when it actually does not. A
bloom filter never results in a false negative; that is, it never states that data is not there
when it is. The reason to use a bloom filter, despite its false positives, is that
it is very fast and its implementation is really simple.
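To show how simple the structure is, here is a minimal bloom filter sketch. The bit-array size, the number of hash positions, and the use of salted SHA-256 to derive positions are illustrative assumptions, not the parameters or hash functions Cassandra uses.

```python
import hashlib


class BloomFilter:
    """Minimal bloom filter: a bit array plus several hash-derived positions per item."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes positions by salting a single hash function
        # (an illustrative choice, assumed here for simplicity).
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


bf = BloomFilter()
for row_key in ("row-1", "row-2", "row-3"):
    bf.add(row_key)
```

Every added key sets all of its bits, so a lookup for an added key can never fail, which is why there are no false negatives. A key that was never added can still find all of its bits set by other keys, which is the source of false positives.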
Cassandra uses bloom filters to determine whether an SSTable has the data for a particular
row key. Bloom filters are not used for range scans, but they are good candidates for index
scans. This saves a lot of disk I/O that might otherwise be spent on a full SSTable scan, which is a slow