Performance Tuning - Mastering Apache Cassandra


Compression is configured per table; if you do not specify a compressor, LZ4Compressor is applied to the table by default. This is how you alter the compression type (for details about setting compression when the table is created, see Chapter 3, Effective CQL):

ALTER TABLE users
WITH COMPRESSION = {
  'sstable_compression': 'DeflateCompressor'
};
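For comparison, the same option map can be supplied when a table is created. The following is a minimal sketch only: the table name users_archive and its columns are hypothetical and chosen purely for illustration, and the syntax assumes the Cassandra 2.1 compression options discussed in this section.

```cql
-- Hypothetical table; only the WITH clause matters here.
CREATE TABLE users_archive (
  user_id uuid PRIMARY KEY,
  name text,
  email text
) WITH compression = {
  'sstable_compression': 'SnappyCompressor'
};
```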

Let's see the compression options we have.

The sstable_compression parameter specifies which compressor is used to compress the on-disk representation of an SSTable when a MemTable is flushed (compression takes place at the time of the flush). Cassandra version 2.1.0 provides three compressors out of the box: LZ4Compressor, SnappyCompressor, and DeflateCompressor.

LZ4Compressor is 50 percent faster than SnappyCompressor, which in turn is faster than DeflateCompressor. In general, this means that when you move from DeflateCompressor to LZ4Compressor, the compressed data will take a little extra space, but reads will be faster.

Like everything else in Cassandra, compressors are pluggable. You can write your own compressor by implementing org.apache.cassandra.io.compress.ICompressor, compiling it, and putting the .class or .jar files in the lib directory. Then provide the fully qualified class name of the compressor as the sstable_compression value.
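As a rough illustration of that last step, a custom compressor would be referenced by its fully qualified class name in place of a built-in one. The class name com.example.compress.MyCompressor below is purely hypothetical; it stands in for whatever ICompressor implementation you have placed in the lib directory of every node.

```cql
-- com.example.compress.MyCompressor is a hypothetical class name,
-- assumed to implement ICompressor and to be on each node's classpath.
ALTER TABLE users
WITH compression = {
  'sstable_compression': 'com.example.compress.MyCompressor'
};
```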

The chunk length (chunk_length_kb) is the smallest slice of the row that gets decompressed during reads. Depending on the query pattern and the median size of the rows, this parameter can be tuned so that it is big enough that a read does not have to decompress multiple chunks, but small enough that it does not decompress excessive unnecessary data. In practice, this is hard to guess. The most common suggestion, if you have no better information, is to keep it at 64 KB.
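The chunk length is set in the same option map as the compressor. A minimal sketch, assuming the users table from the earlier example and the commonly suggested 64 KB value:

```cql
-- Set the compressor and the chunk length together.
-- The value is in kilobytes and must be a power of two.
ALTER TABLE users
WITH compression = {
  'sstable_compression': 'LZ4Compressor',
  'chunk_length_kb': 64
};
```

If most reads fetch rows much smaller than the chunk, a smaller power of two may be worth testing against your own workload; treat any value other than the default as something to measure rather than assume.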

Compression can be added, removed, or altered at any time during the lifetime of a table. In general, compression improves performance, and it is a great way to maximize the

utilization of disk space. Compression gives double to quadruple reduction in data size
