Replica placement strategies
Apart from distributing data across nodes on the basis of the nodes' tokens, Cassandra has
to replicate the data according to the replication factor configured for the keyspace.
The replica placement strategy decides where each replica is placed.
Two strategies are available. Which one to use depends on the requirements and structure
of the cluster.
SimpleStrategy
SimpleStrategy places the first replica on the node that owns the key according to the
configured partitioner. It then moves to the next node on the ring (toward a higher bucket),
places a replica, moves to the next node and places another, and so on, until the
replication factor is met.
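As a rough illustration of this walk, the following Python sketch models the ring as a
sorted list of (token, node) pairs. The function name, the node names, and the small
integer tokens are hypothetical and chosen for readability; this is a simplified model,
not Cassandra's actual implementation.

```python
from bisect import bisect_left

def simple_strategy_replicas(ring, key_token, rf):
    """Return the nodes that hold replicas for key_token.

    ring is a list of (token, node) pairs sorted by token.
    rf is the replication factor.
    """
    tokens = [token for token, _node in ring]
    # The primary owner is the first node whose token is >= the key's token;
    # a key beyond the highest token wraps around to the first node.
    start = bisect_left(tokens, key_token) % len(ring)
    # Never return more replicas than there are nodes.
    count = min(rf, len(ring))
    # Walk the ring in token order, ignoring racks and data centers.
    return [ring[(start + i) % len(ring)][1] for i in range(count)]

# Hypothetical four-node ring with made-up tokens.
ring = [(0, "node-a"), (25, "node-b"), (50, "node-c"), (75, "node-d")]
print(simple_strategy_replicas(ring, 30, 3))  # ['node-c', 'node-d', 'node-a']
```

Note that the walk wraps around the end of the list, which is why the third replica for
token 30 lands on node-a.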
SimpleStrategy is blind to cluster topology. It does not check whether the next node
chosen for a replica is in the same rack as the previous one. Thus, it may not be the most
robust strategy for storing data. What happens if all three replicas of a key range are
physically located in the same rack (assuming RF=3) and there is a power failure on that
rack? You lose access to some data until power is restored. This leads us to a smarter
strategy, NetworkTopologyStrategy.
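The rack problem can be made concrete with a small Python sketch. The six-node ring, the
rack labels, and the helper names below are all hypothetical; the code only models a
topology-blind walk around the ring and reports which primary owners end up with every
replica in a single rack.

```python
# Hypothetical ring in token order; each node is tagged with its rack.
ring = [
    ("node-a", "rack1"), ("node-b", "rack1"), ("node-c", "rack1"),
    ("node-d", "rack2"), ("node-e", "rack2"), ("node-f", "rack2"),
]

def replicas_from(ring, start, rf):
    """Topology-blind placement: the owner plus the next rf - 1 nodes."""
    return [ring[(start + i) % len(ring)] for i in range(rf)]

def single_rack_owners(ring, rf):
    """Return owners whose key range has all rf replicas in one rack."""
    at_risk = []
    for start, (owner, _rack) in enumerate(ring):
        racks = {rack for _node, rack in replicas_from(ring, start, rf)}
        if len(racks) == 1:
            at_risk.append(owner)
    return at_risk

print(single_rack_owners(ring, 3))  # ['node-a', 'node-d']
```

In this made-up layout, the ranges owned by node-a and node-d become unavailable if
their rack loses power, because no replica lives outside that rack.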
Despite these weaknesses, SimpleStrategy is the default strategy. Moreover, if you do
not know the rack placement or other configuration details of your data center, and you
decide to stay in a single data center, NetworkTopologyStrategy cannot help you much.
NetworkTopologyStrategy
NetworkTopologyStrategy, as the name suggests, is a data center- and rack-aware
replica placement strategy. NetworkTopologyStrategy tries to avoid the pitfalls of
SimpleStrategy by considering the rack and data center names that it obtains from the
configured snitch. With the appropriate strategy_option, which states how many replicas
go to which data center, NetworkTopologyStrategy makes Cassandra a very powerful and
robust mirrored database system.
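A simplified Python sketch of rack-aware placement follows. It assumes that each ring
entry carries the data center and rack names a snitch would report, and that the
per-data-center replica counts arrive as a plain dictionary. The function name, the
node names, and the fallback rule for data centers with too few racks are assumptions
made for illustration; the real algorithm differs in its details.

```python
def nts_replicas(ring, start, replicas_per_dc):
    """Pick replicas per data center, preferring distinct racks.

    ring is a list of (node, dc, rack) tuples in token order.
    start is the ring index of the primary owner.
    replicas_per_dc maps a data center name to its replica count.
    """
    chosen = []
    for dc, wanted in replicas_per_dc.items():
        picked, skipped, seen_racks = [], [], set()
        # Walk the whole ring once, starting from the primary owner.
        for i in range(len(ring)):
            node, node_dc, rack = ring[(start + i) % len(ring)]
            if node_dc != dc:
                continue
            if rack in seen_racks:
                # Same rack as an earlier pick: keep only as a fallback.
                skipped.append(node)
            else:
                picked.append(node)
                seen_racks.add(rack)
        # Distinct racks first, then fall back to already-used racks.
        chosen.extend((picked + skipped)[:wanted])
    return chosen

# Hypothetical two-data-center ring with interleaved nodes.
ring = [
    ("a1", "DC1", "r1"), ("b1", "DC2", "r1"), ("a2", "DC1", "r1"),
    ("b2", "DC2", "r2"), ("a3", "DC1", "r2"), ("b3", "DC2", "r2"),
]
print(nts_replicas(ring, 0, {"DC1": 2, "DC2": 1}))  # ['a1', 'a3', 'b1']
```

Here a2 is passed over for DC1's second replica because it shares rack r1 with a1, so
the two DC1 replicas survive the loss of either rack.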
NetworkTopologyStrategy requires the system admin to put a little extra thought into
choosing appropriate initial token values for multiple data center installations.
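One commonly used convention, shown here only as a sketch, is to space tokens evenly
within each data center and then shift each data center by a small offset so that no two
nodes share a token. The code below assumes a token space of 0 to 2**127, as used by
RandomPartitioner; the function name and the offset rule are assumptions, and other
schemes exist.

```python
RING_SIZE = 2 ** 127  # assumed token space (RandomPartitioner)

def initial_tokens(nodes_per_dc):
    """Return evenly spaced tokens per data center, offset per data center.

    nodes_per_dc maps a data center name to its node count.
    """
    tokens = {}
    for dc_index, (dc, node_count) in enumerate(nodes_per_dc.items()):
        # Space the tokens evenly, then shift by the data center's index
        # so that data centers do not reuse each other's tokens.
        tokens[dc] = [
            (i * RING_SIZE // node_count + dc_index) % RING_SIZE
            for i in range(node_count)
        ]
    return tokens

# Hypothetical installation: two data centers with two nodes each.
tokens = initial_tokens({"DC1": 2, "DC2": 2})
print(tokens["DC1"] == [0, 2 ** 126])          # True
print(tokens["DC2"] == [1, 2 ** 126 + 1])      # True
```

With unequal node counts the offsets can still collide in rare cases, so the resulting
tokens should be checked for uniqueness before use.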