The right bloom filter setting depends on your workload. If you have an analytics cluster that does mostly range scanning, bloom filters are not necessary. Also, LeveledCompaction typically leaves rows slightly less fragmented across SSTables than SizeTieredCompaction does. Therefore, the default value of bloom_filter_fp_chance can be slightly higher. Keep in mind that memory savings are nonlinear: going from a setting of 0.01 to 0.1 saves only about one-third of the memory, even though you are changing bloom_filter_fp_chance by an order of magnitude.
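The nonlinearity can be illustrated with the textbook sizing formula for an optimally configured bloom filter, bits per key = -ln(p) / (ln 2)^2, where p is the false positive chance. The following is a minimal sketch under that idealized model, not Cassandra's actual sizing code, and the function names are hypothetical. In this model the memory cost grows with the logarithm of 1/p, so a tenfold change in the false positive chance changes the filter size by a fixed number of bits per key rather than by a factor of ten. The savings a real cluster sees depend on Cassandra's implementation and can differ from the idealized figure.

```python
import math


def ideal_bits_per_key(fp_chance: float) -> float:
    """Bits per key for an optimally sized bloom filter (textbook model)."""
    if not 0.0 < fp_chance < 1.0:
        raise ValueError("fp_chance must be between 0 and 1 (exclusive)")
    return -math.log(fp_chance) / (math.log(2) ** 2)


def ideal_memory_saving(old_fp: float, new_fp: float) -> float:
    """Fraction of filter memory saved by moving from old_fp to new_fp."""
    return 1.0 - ideal_bits_per_key(new_fp) / ideal_bits_per_key(old_fp)


if __name__ == "__main__":
    # Compare a few candidate settings for bloom_filter_fp_chance.
    for fp in (0.001, 0.01, 0.1):
        print(f"fp_chance={fp}: {ideal_bits_per_key(fp):.2f} bits per key")
    print(f"idealized saving 0.01 -> 0.1: {ideal_memory_saving(0.01, 0.1):.0%}")
```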
In Cassandra version 1.2, bloom filters are stored off-heap. This means that you don't need to think about the size of the bloom filters when attempting to figure out the maximum memory size for the JVM. You can easily alter the bloom_filter_fp_chance setting on a per-ColumnFamily basis, as shown in Listing 6.3.
Listing 6.3 Adjust the Bloom Filter False Positive Chance for an Existing ColumnFamily
ALTER TABLE events WITH bloom_filter_fp_chance = 0.01;
Once you update the bloom_filter_fp_chance for a ColumnFamily, you need to regenerate the bloom filters. This can be done either by forcing a compaction or by running upgradesstables through nodetool.
Another good way to see whether your bloom filter settings can be adjusted is through a little trial and error. If you run nodetool cfstats, you will see the number of bloom filter false positives and the bloom filter false positive ratio for a specific ColumnFamily. In general, you want to minimize the number of bloom filter false positives. But you also have some leeway when raising bloom_filter_fp_chance before you start getting a significant number of them. Tune the value gradually and watch for the point where your false positive rate starts to increase.
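When repeating this check after each adjustment, it can help to pull the bloom filter figures out of the cfstats text automatically. The following is a minimal sketch, assuming the relevant lines start with "Bloom Filter" and end with a numeric value. The sample text is a hypothetical excerpt for a single ColumnFamily, the exact labels can differ between Cassandra versions, and the function name is made up for illustration.

```python
import re

# Hypothetical excerpt for one ColumnFamily. Real cfstats output has many
# more fields, and its labels can differ between Cassandra versions.
SAMPLE_CFSTATS = """\
    Column Family: events
    Bloom Filter False Positives: 42
    Bloom Filter False Ratio: 0.01350
    Bloom Filter Space Used: 1048576
"""

# Matches lines such as "Bloom Filter False Ratio: 0.01350".
BLOOM_LINE = re.compile(
    r"^[ \t]*(Bloom Filter [^:\n]+):[ \t]*([0-9.]+)[ \t]*$", re.MULTILINE
)


def parse_bloom_stats(cfstats_text: str) -> dict:
    """Return every 'Bloom Filter ...' metric in the text as label -> float."""
    return {label: float(value) for label, value in BLOOM_LINE.findall(cfstats_text)}


if __name__ == "__main__":
    for label, value in parse_bloom_stats(SAMPLE_CFSTATS).items():
        print(f"{label}: {value}")
```

Recording these values before and after each change to bloom_filter_fp_chance gives a simple way to spot where the false positive rate begins to climb. Because labels repeat for every ColumnFamily, text covering several ColumnFamilies would need to be split per ColumnFamily before parsing.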
System Tuning
Out of the box, Linux comes configured to run pretty well. Since running Cassandra is not a normal workload for the basic server configuration, you can make a few small tweaks and get a noticeable performance improvement.