Performance Tuning - Cassandra: The Definitive Guide

to improve performance. The cost is that bootstrapping can take longer if there is considerable data in the column family to preload.

The rows_cached setting specifies the number of rows that will be cached. By default, this value is set to 0, meaning that no rows will be cached, so it's a good idea to turn this on. If you use a fraction, you're indicating a percentage of all rows to cache, and an integer value indicates an absolute number of rows to cache. You'll want to use this setting carefully, however, as this can easily get out of hand. If your column family gets far more writes than reads, then setting this number very high will needlessly consume considerable server resources. If your column family has a higher ratio of reads to writes, but has rows with lots of data in them (hundreds of columns), then you'll need to do some math before setting this number very high. And unless you have certain rows that get hit a lot and others that get hit very little, you're not going to see much of a boost here.
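
The "math" mentioned above can be sketched in a few lines of Python. This is only a rough illustration, not how Cassandra itself computes anything: the function names are hypothetical, the rule that a value between 0 and 1 is a fraction is an assumption of this sketch, and the estimate ignores JVM object overhead, so real memory use will be higher.

```python
def resolve_rows_cached(setting, total_rows):
    """Turn a rows_cached value into an absolute number of rows.

    Assumption for this sketch: a value strictly between 0 and 1 is a
    fraction of all rows; any other non-negative value is a row count.
    """
    if setting < 0:
        raise ValueError("rows_cached cannot be negative")
    if 0 < setting < 1:
        return int(setting * total_rows)  # rounds down
    return int(setting)


def estimate_row_cache_bytes(rows_cached, columns_per_row, bytes_per_column):
    """Lower-bound estimate of the raw data held by the row cache.

    Ignores per-object overhead, so treat the result as a floor.
    """
    return rows_cached * columns_per_row * bytes_per_column


# Hypothetical workload: cache 100,000 rows of 300 columns, 64 bytes each.
rows = resolve_rows_cached(100_000, total_rows=2_000_000)
cache_bytes = estimate_row_cache_bytes(rows, columns_per_row=300, bytes_per_column=64)
print(f"{cache_bytes / 2**30:.2f} GiB of raw column data")
```

With these made-up numbers the raw column data alone comes to roughly 1.8 GiB, which shows how quickly a large rows_cached value can outgrow the heap when rows are wide.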

Buffer Sizes

The buffer size settings control how much memory is allocated when performing certain operations. The following is a quick overview of these settings:

flush_data_buffer_size_in_mb

By default, this is set to 32 megabytes and indicates the size of the buffer to use when memtables get flushed to disk.

flush_index_buffer_size_in_mb

By default, this is set to 8 megabytes. If each key defines only a few columns, then it's a good idea to increase the index buffer size. Conversely, if your rows have many columns, then you'll want to decrease the size of the buffer.

sliced_buffer_size_in_kb

This setting allows you to specify the size, in kilobytes, of the buffer to use when executing slices of adjacent columns. Depending on how variable your queries are, it is unlikely to be very useful. If there is a certain slice query that you perform far more than others, or if your data is laid out with a relatively consistent number of columns per family, then this setting could be moderately helpful on read operations. Note, however, that this setting is defined globally rather than per column family.
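
If one slice query does dominate your reads, a rough starting point is the size of the slice it typically returns. The sketch below simply estimates that size in kilobytes. The function name and the sample numbers are hypothetical, and matching the buffer to the typical slice size is a rule of thumb assumed here, not a guarantee of better read performance.

```python
def typical_slice_kb(columns_per_slice, avg_column_bytes):
    """Approximate size of one slice of adjacent columns, rounded up to whole KB."""
    total_bytes = columns_per_slice * avg_column_bytes
    return (total_bytes + 1023) // 1024  # integer ceiling division


# Hypothetical query: slices of 100 adjacent columns, about 200 bytes per column.
print(typical_slice_kb(100, 200), "KB per typical slice")
```

Because the setting is global, a value sized for one hot query also applies to every other slice the node executes, so measure before and after changing it.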
