The right bloom filter setting depends on your workload. If you have an
analytics cluster that does mostly range scanning, bloom filters are not
necessary, because range scans do not use them. Also, LeveledCompaction
typically causes slightly less fragmentation within the SSTables than
SizeTieredCompaction. Therefore, the default value of bloom_filter_fp_chance
under LeveledCompaction can be slightly higher. Keep in mind that memory
savings are nonlinear: going from a setting of 0.01 to 0.1 saves about
one-third of the memory even though you are changing bloom_filter_fp_chance
by an order of magnitude.
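The nonlinearity can be illustrated with the textbook sizing formula for an optimally configured bloom filter, which needs -ln(p) / (ln 2)^2 bits per key for a false positive chance p. The sketch below is only that idealized model; the function name bits_per_key is made up for this example, and Cassandra's own sizing logic is not reproduced here, so the saving you observe on a real cluster may differ from these figures.

```python
import math


def bits_per_key(fp_chance):
    """Bits per stored key for an optimally sized bloom filter (textbook model).

    This is NOT Cassandra's sizing code; it only illustrates that memory
    grows with the logarithm of the false positive chance.
    """
    if not 0.0 < fp_chance < 1.0:
        raise ValueError("fp_chance must be between 0 and 1 (exclusive)")
    return -math.log(fp_chance) / (math.log(2) ** 2)


if __name__ == "__main__":
    for p in (0.001, 0.01, 0.1):
        print(f"fp_chance={p}: {bits_per_key(p):.2f} bits per key")
```

In this model a setting of 0.01 costs roughly 9.6 bits per key and a setting of 0.1 roughly 4.8, so a tenfold change in bloom_filter_fp_chance changes the memory by far less than a factor of ten.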
In Cassandra version 1.2, bloom filters are stored off-heap. This means that
you don't need to account for the size of the bloom filters when working out
the maximum heap size for the JVM. You can alter the bloom_filter_fp_chance
setting on a per-ColumnFamily basis, as shown in Listing 6.3.
Listing 6.3 Adjust the Bloom Filter False Positive Chance for an Existing
ColumnFamily
ALTER TABLE events WITH bloom_filter_fp_chance = 0.01;
Once you update bloom_filter_fp_chance for a ColumnFamily, you need to
regenerate the bloom filters, because existing SSTables keep the filters they
were written with. This can be done either by forcing a compaction or by
running upgradesstables through nodetool.
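As a rough sketch of how the two options might be scripted, the helper below builds the nodetool command lines and, by default, only prints them. The keyspace name analytics and the function names are hypothetical; the table name events comes from Listing 6.3. The optional -a flag (include all SSTables) exists in newer nodetool releases and should be checked against your version before relying on it.

```python
import subprocess


def regenerate_commands(keyspace, table, include_all=False):
    """Return the nodetool argument lists for both regeneration options."""
    upgrade = ["nodetool", "upgradesstables"]
    if include_all:
        # Assumption: -a (include all SSTables) is supported by your nodetool.
        upgrade.append("-a")
    upgrade += [keyspace, table]
    return {
        "compact": ["nodetool", "compact", keyspace, table],
        "upgradesstables": upgrade,
    }


def regenerate(keyspace, table, method="upgradesstables",
               include_all=False, dry_run=True):
    """Print the chosen command (dry run) or execute it on this node."""
    argv = regenerate_commands(keyspace, table, include_all)[method]
    if dry_run:
        print(" ".join(argv))
    else:
        subprocess.run(argv, check=True)  # raises if nodetool exits non-zero
    return argv


if __name__ == "__main__":
    # "analytics" is a hypothetical keyspace name used for illustration.
    regenerate("analytics", "events", method="compact")
    regenerate("analytics", "events", method="upgradesstables")
```

Both commands act on the node they are run against, so a script like this would need to be repeated on each node in the cluster.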
Another good way to see whether your bloom filter settings can be adjusted is
a little trial and error. If you run nodetool cfstats, you will see the number
of bloom filter false positives and the bloom filter false positive ratio for
each ColumnFamily. In general, you want to minimize the number of bloom filter
false positives. But you also have some leeway: you can raise
bloom_filter_fp_chance a fair amount before you start getting a significant
number of false positives. You will have to tune the value and watch where
your false positive rate starts to increase.
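To make the trial-and-error loop easier to repeat, the false positive figures can be pulled out of saved cfstats output. The sketch below assumes the output contains lines labeled "Column Family:" (or "Table:" in later releases), "Bloom Filter False Positives:", and "Bloom Filter False Ratio:"; the exact labels vary between versions, so the patterns are case-insensitive and may need adjusting. The sample text and its numbers are invented for illustration.

```python
import re

# Label patterns are assumptions about cfstats output; adjust for your version.
_TABLE_RE = re.compile(r"^\s*(?:Column Family|Table):\s*(\S+)", re.IGNORECASE)
_FP_RE = re.compile(r"^\s*Bloom filter false positives:\s*(\d+)", re.IGNORECASE)
_RATIO_RE = re.compile(r"^\s*Bloom filter false ratio:\s*([0-9.]+)", re.IGNORECASE)


def bloom_stats(cfstats_text):
    """Map each ColumnFamily name to its bloom filter false positive figures."""
    stats = {}
    current = None
    for line in cfstats_text.splitlines():
        match = _TABLE_RE.match(line)
        if match:
            current = match.group(1)
            stats[current] = {}
            continue
        if current is None:
            continue  # still in the keyspace-level header
        match = _FP_RE.match(line)
        if match:
            stats[current]["false_positives"] = int(match.group(1))
            continue
        match = _RATIO_RE.match(line)
        if match:
            stats[current]["false_ratio"] = float(match.group(1))
    return stats


# Invented sample; real output contains many more lines per ColumnFamily.
SAMPLE = """\
Keyspace: analytics
    Column Family: events
    SSTable count: 4
    Bloom Filter False Positives: 12
    Bloom Filter False Ratio: 0.00150
    Bloom Filter Space Used: 11688
"""

if __name__ == "__main__":
    for name, figures in bloom_stats(SAMPLE).items():
        print(name, figures)
```

Recording these figures before and after each change to bloom_filter_fp_chance gives a simple way to see where the false positive ratio begins to climb.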
System Tuning
Out of the box, Linux is configured to run general-purpose workloads
reasonably well. Because Cassandra is not a typical workload for a basic
server configuration, a few small tweaks can yield a noticeable performance
improvement.
 