More tuning via cassandra.yaml
The cassandra.yaml file is the hub of almost all the global settings for the node or the
cluster. It is well documented and easy to understand by reading it. The following
sections list some of the properties from Cassandra version 2.1.0, with a short
description of each. You should refer to the cassandra.yaml file of your own version of
Cassandra and read the details there.
commitlog_sync
As we know from Chapter 2, Cassandra Architecture, Cassandra provides durable writes by
appending new writes to the commit log. This is not entirely true. To guarantee that a
write survives a hard reboot or crash, it must be fsync'd to the disk. Flushing the
commit log after each write is detrimental to write performance due to slow disk seeks.
Instead, Cassandra periodically (by default, commitlog_sync: periodic) flushes the data
to the disk at an interval set by commitlog_sync_period_in_ms, in milliseconds.
However, Cassandra does not wait for the commit log to synchronize; it acknowledges the
write immediately.
This means that if heavy writes are going on and the machine crashes, you will lose at
most the data written in the commitlog_sync_period_in_ms window. You should not really
worry. The replication factor and consistency level help recover from this loss, unless
you are unlucky enough that all the replicas die at the same instant.
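The loss window described above can be illustrated with a small toy model. This is only a sketch and not Cassandra's implementation: the class name PeriodicCommitLog, its methods, and the caller-supplied clock are hypothetical, and the 10000 ms period is assumed here for illustration (check commitlog_sync_period_in_ms in your own cassandra.yaml).

```python
class PeriodicCommitLog:
    """Toy model of periodic commit log syncing (hypothetical, not Cassandra code)."""

    def __init__(self, sync_period_ms=10000):
        self.sync_period_ms = sync_period_ms  # stands in for commitlog_sync_period_in_ms
        self.unsynced = []     # acknowledged, but not yet flushed to disk
        self.synced = []       # flushed to disk; survives a crash
        self.last_sync_ms = 0

    def _sync_if_due(self, now_ms):
        # Flush everything accumulated so far once the period has elapsed.
        if now_ms - self.last_sync_ms >= self.sync_period_ms:
            self.synced.extend(self.unsynced)
            self.unsynced.clear()
            self.last_sync_ms = now_ms

    def write(self, record, now_ms):
        self._sync_if_due(now_ms)
        self.unsynced.append(record)
        return True  # acknowledged immediately, without waiting for a sync

    def crash(self):
        # Whatever was acknowledged but never synced is lost on this node.
        lost = list(self.unsynced)
        self.unsynced.clear()
        return lost


log = PeriodicCommitLog(sync_period_ms=10000)
log.write("a", now_ms=1000)
log.write("b", now_ms=9000)
log.write("c", now_ms=12000)   # period elapsed: "a" and "b" are synced first
lost = log.crash()             # only "c" was still inside the unsynced window
```

In this model a crash loses only what was written since the last sync, which is the window that replication is expected to cover.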
Note
The fsync function transfers (flushes) all modified in-core data of (that is, modified
buffer cache pages for) the file referred to by the file descriptor (fd) to the disk
device (or other permanent storage device) so that all changed information can be
retrieved even after the system crashes or is rebooted. For more information, visit
http://linux.die.net/man/2/fsync.
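As a rough illustration of what an fsync'd write looks like from application code, the following sketch appends a record to a file and waits for fsync to return before treating the write as durable. The function name durable_append and the file name are hypothetical; this shows the general pattern, not how Cassandra writes its commit log.

```python
import os
import tempfile


def durable_append(path, record):
    """Append record to path and return only after fsync has completed."""
    with open(path, "ab") as f:
        f.write(record)        # data lands in Python's userspace buffer
        f.flush()              # push the buffer to the operating system
        os.fsync(f.fileno())   # ask the OS to write its cached pages to the device


demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "commitlog-demo.bin")
durable_append(demo_path, b"write-1\n")
durable_append(demo_path, b"write-2\n")
```

Calling fsync on every record, as this function does, is the per-write cost that the periodic mode avoids.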
The default periodic setting gives high performance at some risk. For someone who is
paranoid about data loss, Cassandra provides a guaranteed-write option: set
commitlog_sync to batch. In batch mode, Cassandra accrues all the writes going to the
commit log and then fsyncs after commitlog_sync_batch_window_in_ms, which is usually set
to a small value such as 50 milliseconds. This avoids flushing to the disk after every
write, but the durability guarantee forces the acknowledgement (that the