tings should take into account schema (ColumnFamily and column layout)
in addition to overall memory.
Larger MemTables do not improve write performance. This is because
writes are happening to memory anyway. There is no way to speed up this
process unless your CommitLog and SSTables are on separate volumes. If
the CommitLog and SSTables were to share a volume, they would be in
contention for I/O.
Larger MemTables are better for unbatched writes. If you batch your writes,
you will likely not see a large benefit. But if you do unbatched writes,
compaction will have a greater effect on read performance because it will
do a better job of grouping like data together.
Larger MemTables lead to more effective compaction. Having a lot of
little MemTables is bad as it leads to a lot of turnover. It also leads to a lot
of additional seeks when the read requests hit memory.
The performance tuning of MemTables can double as a pressure release valve
if your Cassandra nodes start to get overloaded. It shouldn't be your only method
of emergency release, but it can act as a good complement. In the
cassandra.yaml file, there is a setting called flush_largest_memtables_at.
The default setting is 0.75. The value is a fraction of the heap, so 0.75 means
75 percent. Under the hood, every time a full garbage collection (GC) completes,
the heap usage is checked. If the fraction of the heap in use is still greater than
the threshold (0.75 by default), the largest MemTables will be flushed. This setting
is more effective under read-heavy workloads. In write-heavy workloads, there will
probably be too little memory freed too late in the cycle to be of significant value.
If you notice the heap filling up from MemTables frequently, you may need to either
add more capacity or adjust the heap setting in the JVM.
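The check described above can be sketched in a few lines of Python. This is only an illustration of the decision rule, not Cassandra's actual code: the function name, its parameters, and the dictionary of MemTable sizes are hypothetical, and for simplicity it picks a single largest MemTable.

```python
def pick_memtable_to_flush(heap_used_mb, heap_max_mb, memtable_sizes_mb,
                           threshold=0.75):
    """Return the name of the largest MemTable if heap usage is above
    the threshold after a full GC, otherwise None.

    All names here are hypothetical; threshold mirrors the default
    value of flush_largest_memtables_at.
    """
    if heap_max_mb <= 0:
        raise ValueError("heap_max_mb must be positive")
    usage = heap_used_mb / heap_max_mb  # fraction of the heap still in use
    if usage <= threshold or not memtable_sizes_mb:
        return None  # enough memory was freed, or nothing to flush
    # Flushing the largest MemTable frees the most memory in one step.
    return max(memtable_sizes_mb, key=memtable_sizes_mb.get)


# Hypothetical node: 8 GB heap, two MemTables.
sizes = {"users": 300, "events": 900}
print(pick_memtable_to_flush(7000, 8192, sizes))  # heap is about 85% full
print(pick_memtable_to_flush(4000, 8192, sizes))  # heap is about 49% full
```

With the heap about 85 percent full, the sketch selects the larger "events" MemTable. At about 49 percent, it returns None and nothing is flushed.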
The memtable_total_space_in_mb setting in the cassandra.yaml file is
commented out by default. When it is commented out, Cassandra will
automatically set the value to one-third the size of the heap. You typically don't
need to adjust this setting as one-third of the heap is sufficient. If you are in a
write-heavy environment, you may want to increase this value. Since you already
know the size of the JVM heap, you can just calculate what the new size of the
total space allotted for MemTables should be. Try not to be too aggressive here as
stealing memory from other parts of Cassandra can have negative consequences.
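As a rough sketch of that calculation, the Python helpers below compute the one-third default and the share of the heap a proposed value would claim. The function names and the 8 GB heap are assumptions made for illustration, and the integer rounding is a simplification rather than Cassandra's exact behavior.

```python
def default_memtable_space_mb(heap_mb):
    """One-third of the heap, rounded down to whole megabytes.

    Mirrors the default described for memtable_total_space_in_mb;
    the rounding is an assumption of this sketch.
    """
    return heap_mb // 3


def memtable_share(proposed_mb, heap_mb):
    """Fraction of the heap that a proposed MemTable budget would take."""
    return proposed_mb / heap_mb


heap_mb = 8192  # hypothetical 8 GB JVM heap
print(default_memtable_space_mb(heap_mb))  # default budget in MB
print(memtable_share(4096, heap_mb))       # share taken by a 4 GB budget
```

Checking the share before editing cassandra.yaml makes it easier to see how much heap would be left for the rest of Cassandra.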
The setting of memtable_flush_writers is another one that comes unset
out of the box. By default, it's set to the number of data directories defined in the