not see the written value until a long time after you wrote it. If any node in the cluster is alive,
ANY should succeed.
NOTE
If you're new to Cassandra, the replication factor is easy to confuse with the consistency level.
The replication factor is set per keyspace in the server's configuration file; the consistency level
is specified per query, by the client. The replication factor indicates how many nodes will store
a copy of each value written. The consistency level specifies how many of those replicas must
respond before the client considers a read or write operation successful. The confusion arises
because the consistency level is expressed in terms of the replication factor, not the number of
nodes in the system.
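The arithmetic behind this relationship can be sketched in a few lines. These helper functions are illustrative only (they are not part of Cassandra or its client libraries): a QUORUM is a majority of the replicas, and reads are guaranteed to overlap writes on at least one replica whenever the write and read consistency levels together exceed the replication factor.

```python
# Illustrative arithmetic only; these helpers are not part of Cassandra.

def quorum(replication_factor):
    """A QUORUM is a simple majority of the replicas for a value."""
    return replication_factor // 2 + 1

def overlaps(write_cl, read_cl, replication_factor):
    """True when every read must touch at least one replica that
    acknowledged the write, i.e. W + R > RF."""
    return write_cl + read_cl > replication_factor

rf = 3
w = quorum(rf)  # 2 of 3 replicas must acknowledge each write
r = quorum(rf)  # 2 of 3 replicas must answer each read
print(w, r, overlaps(w, r, rf))  # 2 2 True
```

Note that with a replication factor of 3, quorum reads and writes need only 2 responses each, regardless of how many nodes are in the cluster.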
Increasing the Replication Factor
The replication factor is not a setting intended to be changed on a live cluster; ideally it should
be determined ahead of time. But as your application grows and you need to add nodes, you can
increase the replication factor. There are some simple guidelines to follow when you do this.
First, keep in mind that you'll have to restart the nodes after increasing the replication factor,
and then run a repair, because Cassandra must redistribute some data in order to account for
the additional replica. Until the repair completes, clients that connect to a replica that doesn't
yet hold the data may be told that the data does not exist.
A faster way of increasing the replication factor from 1 to 2 is to use nodetool. First, execute
a drain on the original node to ensure that all of its data is flushed to SSTables. Then stop that
node so it doesn't accept any more writes. Next, copy the datafiles from your keyspaces (the files
under the directory named by the DataFileDirectory element in the configuration), taking care
not to copy the files of the internal Cassandra system keyspace. Place those datafiles on the new
node. Change the configuration of both nodes so that the replication factor is set to 2, and make
sure that AutoBootstrap is set to false on both nodes. Then restart both nodes and run nodetool
repair. These steps shorten the window during which clients might receive false empty reads.
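As a sketch, the relevant settings on both nodes might look like the following, assuming the XML-style configuration file this chapter refers to (the keyspace name here is hypothetical, and other required keyspace elements are omitted for brevity):

```xml
<Keyspaces>
  <Keyspace Name="MyKeyspace">
    <!-- Raised from 1 to 2 before restarting both nodes -->
    <ReplicationFactor>2</ReplicationFactor>
  </Keyspace>
</Keyspaces>

<!-- Prevent the new node from trying to stream data on startup;
     the copied datafiles already contain the keyspace's data -->
<AutoBootstrap>false</AutoBootstrap>
```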
To illustrate this, I have three nodes with IP addresses ending in 1.5, 1.7, and 1.8, and their
replication factor is set to 1. I will connect to node 1.5 and perform a write to a column that
hasn't previously existed anywhere:
cassandra> connect 192.168.1.5/9160
Connected to: "TDG Cluster" on 192.168.1.5/9160