Reply Timeout
The reply timeout is a setting that indicates how long Cassandra will wait for other nodes to
respond before deciding that the request is a failure. This is a common setting in relational
databases and messaging systems. This value is set by the RpcTimeoutInMillis element
(rpc_timeout_in_ms in YAML). By default, this is 5,000 milliseconds, or five seconds.
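As a rough illustration, the setting might appear in the YAML configuration file as follows. This
is a sketch only: the value shown is the default quoted above, and the comment is not part of any
shipped configuration file.

```yaml
# How long to wait for other nodes to reply before the request is
# treated as failed, in milliseconds (5000 ms = five seconds).
rpc_timeout_in_ms: 5000
```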
Commit Logs
You can set how large a commit log file is allowed to grow before Cassandra stops appending
new writes to it and creates a new one. This is similar to configuring log rotation in Log4J.
This value is set with the CommitLogRotationThresholdInMB element
(commitlog_rotation_threshold_in_mb in YAML). By default, the value is 128MB.
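A minimal sketch of the rotation threshold in YAML form, using the default value mentioned above
(the comment is illustrative):

```yaml
# Start a new commit log file once the current one reaches this size, in megabytes.
commitlog_rotation_threshold_in_mb: 128
```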
Another setting related to commit logs is the sync operation, represented by the
commitlog_sync element. There are two possible settings for this: periodic and batch. periodic
is the default, and it means that the server makes writes durable only at specified intervals.
When the server is set to make writes durable periodically, you can potentially lose any data
that has not yet been synced to disk from the write-behind cache.
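Periodic mode might be configured along the following lines. This is a sketch that assumes the
sync interval is controlled by a companion setting named commitlog_sync_period_in_ms, which the
text above does not mention; the interval shown is an illustrative value, not a recommendation.

```yaml
# Acknowledge writes immediately and sync the commit log to disk on a timer.
commitlog_sync: periodic
# Assumed companion setting: milliseconds between syncs. Writes made since
# the last sync can be lost if the node crashes before the next one.
commitlog_sync_period_in_ms: 10000
```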
To guarantee durability for your Cassandra cluster, you may want to examine this setting.
If your commit log is set to batch, it will block until the write is synced to disk (Cassandra will
not acknowledge write operations until the commit log has been completely synced to disk). This
clearly has a negative impact on performance.
You can change the value of the configuration attribute from periodic to batch to specify that
Cassandra must flush to disk before it acknowledges a write. Changing this value will require
taking some performance metrics, as there is a necessary trade-off here: forcing Cassandra to
write more immediately constrains its freedom to manage its own resources. If you do set
commitlog_sync to batch, you need to provide a suitable value for
CommitLogSyncBatchWindowInMS, which specifies the number of milliseconds between each sync
effort. Moreover, batch mode is not generally needed in a multinode cluster when using write
replication, because replication by definition means that the write isn't acknowledged until
another node has it.
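For illustration, batch mode might look like this in YAML. The sketch assumes that the YAML name
for CommitLogSyncBatchWindowInMS follows the same lowercase pattern as the other settings, and the
window value is purely a placeholder that should be tuned against your own performance
measurements.

```yaml
# Do not acknowledge a write until the commit log has been synced to disk.
commitlog_sync: batch
# Assumed YAML name for CommitLogSyncBatchWindowInMS: milliseconds between
# sync efforts. The value here is a placeholder, not a recommendation.
commitlog_sync_batch_window_in_ms: 1
```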
If you decide to use batch mode, you will probably want to split the commit log onto a separate
device to mitigate the performance impact. It's a good idea to put the commit log on a separate
disk from the SSTables (data) anyway, even if you don't use batch mode.
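One way to express this separation is sketched below. It assumes the YAML settings
commitlog_directory and data_file_directories, neither of which is named in the text above, and
the mount points are hypothetical paths standing in for two different physical devices.

```yaml
# Hypothetical mount point on a disk dedicated to the commit log.
commitlog_directory: /mnt/commitlog-disk/cassandra/commitlog
# Hypothetical mount point on a different disk for the SSTables (data).
data_file_directories:
    - /mnt/data-disk/cassandra/data
```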