Database Reference
In-Depth Information
Bloom Filters
A bloom filter is a space-efficient probabilistic data structure used to determine whether an element is a member of a set. False positives are possible; false negatives are not. A false positive means that the bloom filter thinks the data is on the node when it actually is not. A false negative would mean that the bloom filter thinks the data is not on the node when it actually is.
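To make that behavior concrete, here is a minimal bloom filter sketch in Java (an illustration only, not Cassandra's implementation): adding a value sets a handful of bits derived from its hash, and a lookup reports "possibly present" only if all of those bits are set, so it can return false positives but never false negatives.

```java
import java.util.BitSet;

// Minimal bloom filter sketch: false positives possible, false negatives not.
public class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    public SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derive several bit positions from one value via double hashing.
    private int position(String value, int i) {
        int h1 = value.hashCode();
        int h2 = (h1 >>> 16) | 1; // force the second hash to be odd
        return Math.floorMod(h1 + i * h2, size);
    }

    public void add(String value) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(value, i));
        }
    }

    // true means "possibly present"; false means "definitely absent".
    public boolean mightContain(String value) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(value, i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SimpleBloomFilter filter = new SimpleBloomFilter(1024, 3);
        filter.add("row-key-1");
        System.out.println(filter.mightContain("row-key-1")); // always true
        System.out.println(filter.mightContain("row-key-2")); // usually false; a false positive is possible
    }
}
```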
Cassandra uses bloom filters to determine whether an SSTable has data for a particular row. They are used for index scans, but not for range scans. On a per-ColumnFamily basis, the higher the bloom_filter_fp_chance setting, the less memory will be used. However, a higher false-positive chance results in greater disk I/O, because more SSTables that do not actually contain the requested row end up being read. It is important to note that starting in Cassandra version 1.2, bloom filters are stored off-heap. This means that they do not need to be taken into consideration when determining the maximum heap size for the JVM.
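As a rough illustration, the setting can be changed per table (ColumnFamily) with a CQL ALTER TABLE statement. The sketch below assumes a locally running node and a hypothetical my_keyspace.my_table, and uses the DataStax Java driver to issue the statement.

```java
import com.datastax.oss.driver.api.core.CqlSession;

public class TuneBloomFilter {
    public static void main(String[] args) {
        // Connects to a Cassandra node on localhost with default settings.
        try (CqlSession session = CqlSession.builder().build()) {
            // Hypothetical table: raise the false-positive chance to spend less
            // memory on bloom filters, at the cost of extra SSTable reads.
            session.execute(
                "ALTER TABLE my_keyspace.my_table "
                    + "WITH bloom_filter_fp_chance = 0.1");
        }
    }
}
```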
Compaction Types
Initially, all data written to Cassandra hits the disk via the CommitLog while also being held in an in-memory memtable. When a memtable is flushed, its contents are written to disk as SSTables, and over time those SSTables are merged together, or compacted. There are two common strategies for this compaction: the default, size-tiered, and the less commonly used leveled.
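For illustration, the compaction strategy is chosen per table, for example when the table is created. The sketch below assumes the DataStax Java driver, a local node, and a hypothetical my_keyspace.events table; swapping in 'LeveledCompactionStrategy' would select leveled compaction instead.

```java
import com.datastax.oss.driver.api.core.CqlSession;

public class ChooseCompactionStrategy {
    public static void main(String[] args) {
        try (CqlSession session = CqlSession.builder().build()) {
            // Hypothetical table using the default size-tiered strategy.
            session.execute(
                "CREATE TABLE IF NOT EXISTS my_keyspace.events ("
                    + "  id uuid PRIMARY KEY,"
                    + "  payload text"
                    + ") WITH compaction = {'class': 'SizeTieredCompactionStrategy'}");
        }
    }
}
```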
SizeTieredCompaction
The default type of compaction in Cassandra, SizeTieredCompaction, is made for insert-heavy workloads that are lighter on reads. The key issue with SizeTieredCompaction is that it needs a large amount of free disk space: in the worst case, a compaction can temporarily take up to twice the size of the data being compacted on disk. In other words, if you have 400GB of data in your SSTables on a 500GB drive, you will likely not be able to complete a compaction. The combined size of the SSTables being compacted determines how much free disk space is required.
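To make the disk-space arithmetic concrete, here is a back-of-the-envelope sketch (not a Cassandra API): it treats the combined size of the SSTables in a compaction as the free space that compaction may temporarily need, which is why 400GB of SSTables on a 500GB drive leaves too little headroom.

```java
// Back-of-the-envelope headroom check for a size-tiered compaction (sketch only).
public class CompactionHeadroom {
    // Worst case: the compaction rewrites every SSTable it merges, so it may
    // need free space roughly equal to their combined size.
    static boolean hasHeadroom(long[] sstableSizesBytes, long freeDiskBytes) {
        long required = 0;
        for (long size : sstableSizesBytes) {
            required += size;
        }
        return freeDiskBytes >= required;
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024 * 1024;
        // 400GB of SSTables on a 500GB drive leaves roughly 100GB free.
        System.out.println(hasHeadroom(new long[] {200 * gb, 200 * gb}, 100 * gb)); // false
    }
}
```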