Database Reference
In-Depth Information
cassandra.yaml. If you have a large heap size and your use case is having Cas-
sandra under a write-heavy workload, this value can be safely increased.
On a similar note, the memtable_flush_queue_size has an effect on
the speed and efficiency of a MemTable flush. This value determines the number
of MemTables to allow to wait for a writer thread to flush to disk. This should
be set, at a minimum, to the maximum number of secondary indexes created on a
single ColumnFamily. In other words, if you have a ColumnFamily with ten sec-
ondary indexes, this value should be 10. The reason for this is that if the memt-
able_flush_queue_size is set to 2 and you have three secondary indexes
on a ColumnFamily, there are two options available: either flush the MemTable
and each index not updated during the initial flush will be out of sync with the
SSTable data, or the new SSTable won't be loaded into memory until all the in-
dexes have been updated. To avoid either of these scenarios, it is recommended
that the value of mem_table_flush_queue_size be set to ensure that all
secondary indexes for a ColumnFamily can be updated at flush time.
Concurrency
Cassandra was designed to have faster write performance than read performance.
One of the ways this is achieved is using threads for concurrency. There are two
settings in the cassandra.yaml that allow control over the amount of concurrency:
concurrent_reads and concurrent_writes . By default, both of these
values are set to 32. Since the writes that come to Cassandra go to memory,
the bottleneck for most performance is going to the disk for reads. To calcu-
late the best setting for concurrent_reads , the value should be 16 * $num-
ber_of_drives. If you have two drives, you can leave the default value of 32. The
reason for this is to ensure that the operating system has enough room to decide
which reads should be serviced at what time.
Coming back to writes, since they go directly to memory, the work is CPU
bound as opposed to being I/O bound like reads. To derive the best setting for
concurrent_writes , you should use 8 * number_of_cores in your system. If
you have a quad-core dual-processor system, the total number of cores is eight and
the concurrent_writes should be set at 64.
Search WWH ::




Custom Search