FIGURE 4.26
Cassandra ring architecture.
Data placement
Data placement around the ring is not fixed by any default configuration. Cassandra provides two
components, snitches and strategies, to determine which nodes receive copies of data.
Snitches define the proximity of nodes within the ring and provide information on the network topology.
Strategies combine the proximity information supplied by the snitches with an
implemented algorithm to select the nodes that will receive writes.
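The division of labor between a snitch and a strategy can be sketched in a few lines. This is only an illustration under stated assumptions: the node names, the `SNITCH` table, and both strategy functions are hypothetical and are not Cassandra's code. The snitch here is a plain lookup from node to (datacenter, rack), and the two strategies differ only in whether they consult it.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical snitch: reports where each node sits as (datacenter, rack).
SNITCH: Dict[str, Tuple[str, str]] = {
    "n1": ("dc1", "r1"),
    "n2": ("dc1", "r1"),
    "n3": ("dc1", "r2"),
    "n4": ("dc1", "r2"),
}

# Hypothetical ring: nodes listed in clockwise order.
RING: List[str] = ["n1", "n2", "n3", "n4"]


def simple_strategy(start: int, replicas: int) -> List[str]:
    """Ignore topology: take the next `replicas` nodes clockwise from `start`."""
    return [RING[(start + i) % len(RING)] for i in range(replicas)]


def rack_aware_strategy(start: int, replicas: int) -> List[str]:
    """Walk clockwise, preferring nodes on racks that hold no replica yet.

    Nodes on an already-used rack are kept as fallbacks and used only
    when there are fewer distinct racks than requested replicas.
    """
    chosen: List[str] = []
    skipped: List[str] = []
    used_racks: Set[Tuple[str, str]] = set()
    for i in range(len(RING)):
        node = RING[(start + i) % len(RING)]
        location = SNITCH[node]  # ask the snitch where this node lives
        if location in used_racks:
            skipped.append(node)
        else:
            chosen.append(node)
            used_racks.add(location)
        if len(chosen) == replicas:
            break
    chosen.extend(skipped[: replicas - len(chosen)])
    return chosen[:replicas]


print(simple_strategy(0, 2))      # ['n1', 'n2'], both replicas on rack r1
print(rack_aware_strategy(0, 2))  # ['n1', 'n3'], one replica per rack
```

With the same ring and the same starting position, the topology-aware strategy spreads the two copies over both racks, which is the kind of decision the snitch information makes possible.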
Data partitioning
Data is distributed across the nodes by partitioners. Because Cassandra is based on a ring
topology, the ring is divided into as many ranges as there are nodes, and each node can be
responsible for one or more ranges of the data. When a node joins a ring, it is issued a token;
this token determines the node's position on the ring and the range of data it is responsible
for. Once the assignment is done, it cannot be undone without reloading all the data.
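The relationship between tokens and ranges can be illustrated with a toy ring. The sketch below assumes a ring of only 100 positions and four made-up node names, and it adopts the convention that a node owns every position after the previous node's token up to and including its own token. None of these values come from the text.

```python
import bisect
from typing import Dict

RING_SIZE = 100  # toy ring; assumption for readability

# Hypothetical token assignment: one token per node.
NODE_TOKENS: Dict[str, int] = {
    "node-a": 0,
    "node-b": 25,
    "node-c": 50,
    "node-d": 75,
}


def owner(key_token: int) -> str:
    """Return the node responsible for `key_token`.

    The owner is the first node whose token is at or after the key's
    token; past the largest token the search wraps to the smallest one.
    """
    ring = sorted((token, name) for name, token in NODE_TOKENS.items())
    tokens = [token for token, _ in ring]
    idx = bisect.bisect_left(tokens, key_token % RING_SIZE)
    return ring[idx % len(ring)][1]


print(owner(10))  # node-b, which owns the range (0, 25]
print(owner(25))  # node-b, the upper bound is inclusive
print(owner(80))  # node-a, the range (75, 99] wraps around to token 0
```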
Cassandra provides native partitioners and also supports user-defined partitioners. The key
difference among the native partitioners is whether they preserve the order of keys.
Random partitioner. This is the default choice for Cassandra. It uses an MD5 hash function to
map keys to tokens, which distributes the keys evenly across the cluster. This hashing
technique ensures that when nodes are added to the cluster, the smallest possible set of data is
affected. Although the keys are evenly distributed, the data is not ordered, so a query over a
range of keys must be processed by all nodes.
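Both properties of hash-based partitioning, even spread and limited data movement when a node joins, can be checked with a small experiment. The following is a minimal sketch, not Cassandra's implementation: it reads the MD5 digest as an unsigned 128-bit integer, which is a simplification of how a real partitioner derives tokens, and the key names, node names, and token values are all hypothetical.

```python
import bisect
import hashlib
from collections import Counter
from typing import Dict, List

RING_SIZE = 2 ** 128  # MD5 digests are 128 bits wide


def md5_token(key: str) -> int:
    """Map a row key to a token by reading its MD5 digest as an integer."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


def owner(token: int, node_tokens: Dict[str, int]) -> str:
    """Return the node whose token is the first one at or after `token`."""
    ring = sorted((t, name) for name, t in node_tokens.items())
    idx = bisect.bisect_left([t for t, _ in ring], token)
    return ring[idx % len(ring)][1]  # wrap past the largest token


def placement(keys: List[str], node_tokens: Dict[str, int]) -> Dict[str, str]:
    """Compute the owning node for every key."""
    return {key: owner(md5_token(key), node_tokens) for key in keys}


keys = [f"user:{i}" for i in range(10_000)]

# Four hypothetical nodes with evenly spaced tokens.
four_nodes = {f"node-{i}": i * RING_SIZE // 4 for i in range(4)}
before = placement(keys, four_nodes)

# A fifth node joins halfway between node-0 and node-1.
five_nodes = {**four_nodes, "node-new": RING_SIZE // 8}
after = placement(keys, five_nodes)

moved = [key for key in keys if before[key] != after[key]]
print(Counter(before.values()))  # roughly a quarter of the keys per node
print(len(moved), "of", len(keys), "keys changed owner")
```

Only the keys that fall in the new node's range change owner, and all of them come from the single node that previously held that range. Sequential keys such as `user:1` and `user:2` land on unrelated positions of the ring, which is why a scan over a range of keys cannot be confined to one node.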
 