target node for each individual row based on the mapping from its token to the new token
range assignments of the four nodes. This is a process known as rebalancing, and it's
rendered unnecessary by virtual nodes.
When a new node joins the cluster, it's simply assigned a handful of virtual nodes that
previously belonged to other machines. Rather than directly recalculating the physical
location of each individual row, Cassandra can simply assign the correct number of virtual
nodes to the new machine (in this case, three) and move their contents over wholesale.
Unlike in a rebalancing scenario, where every physical machine is both losing and gaining
data, redistributing virtual nodes only requires data to be moved from the original three
machines to its new home on the fourth machine. Here's how the ring will now look:
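The same reassignment can be sketched in a few lines of Python. This is only an illustration of the idea, not Cassandra's actual implementation: the machine names, the starting layout of 12 virtual nodes spread round-robin over three machines, and the helper add_machine are all assumptions made for the example. The new machine takes three virtual nodes, one from each existing machine, and nothing else changes hands.

```python
from collections import Counter


def add_machine(ownership, new_machine, take):
    """Hand `take` virtual nodes to `new_machine` (hypothetical helper).

    `ownership` maps a vnode id to the machine that owns it. Each vnode is
    taken from whichever existing machine currently owns the most, so the
    load stays even. Returns the new mapping and a list of
    (vnode, previous_owner) pairs describing what moved.
    """
    result = dict(ownership)
    moved = []
    for _ in range(take):
        # Count vnodes per existing machine, ignoring the newcomer.
        counts = Counter(m for m in result.values() if m != new_machine)
        # Most-loaded machine donates; ties break by machine name.
        donor = max(sorted(counts), key=lambda m: counts[m])
        # Donate that machine's lowest-numbered vnode.
        vnode = min(v for v, m in result.items() if m == donor)
        result[vnode] = new_machine
        moved.append((vnode, donor))
    return result, moved


# Assumed starting layout: 12 vnodes, round-robin over three machines.
machines = ["node1", "node2", "node3"]
ring = {v: machines[v % 3] for v in range(12)}

new_ring, moved = add_machine(ring, "node4", take=3)

for vnode, donor in moved:
    print(f"vnode {vnode}: {donor} -> node4")
print(Counter(new_ring.values()))
```

In this sketch only the three donated virtual nodes move, and each of them moves in one direction, from an original machine to the new one. Every other virtual node keeps its owner, which is the contrast with rebalancing that the text describes.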
Note
While a treatment of virtual nodes is important to cultivate a complete understanding of
how Cassandra data distribution works, it's worth emphasizing that the process of
accommodating changes to cluster topology (such as adding, removing, or replacing nodes) is
entirely transparent to the application. Nodes can be added to, or removed from, a live
Cassandra cluster with no degradation of functionality from the application's standpoint.
The same is true for unexpected changes to the cluster, such as the failure of a node,
thanks to Cassandra replication, which we'll cover next.