Virtual nodes replace physical nodes in the partitioned ring; each virtual node owns a
portion of the token space. Virtual nodes themselves are then owned by physical nodes, but a
physical node does not own a contiguous range of virtual nodes; rather, virtual nodes are
distributed randomly among physical nodes. Crucially, there are many more virtual nodes
than physical nodes; each physical machine is responsible for many virtual nodes.
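The two-level ownership described above (token to virtual node, virtual node to physical node) can be sketched in a few lines of Python. This is a toy model, not Cassandra's implementation: the 32-bit token space, the MD5-based hash, and the names `build_ring`, `token_for`, and `owner_of` are all assumptions made for illustration.

```python
import bisect
import hashlib
import random

TOKEN_SPACE = 2**32  # toy token space; an assumption for illustration only


def build_ring(physical_nodes, vnodes_per_node, seed=0):
    """Give each physical node several randomly placed virtual nodes.

    Returns a list of (token, owner) pairs sorted by token.
    """
    rng = random.Random(seed)
    # Draw distinct random tokens, one per virtual node.
    tokens = rng.sample(range(TOKEN_SPACE), len(physical_nodes) * vnodes_per_node)
    # Deal the tokens out round-robin so every machine gets the same number.
    ring = [
        (token, physical_nodes[i % len(physical_nodes)])
        for i, token in enumerate(tokens)
    ]
    return sorted(ring)


def token_for(key):
    """Hash a partition key into the toy token space (MD5 is a stand-in hash)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def owner_of(ring, key):
    """Return the machine owning the first virtual node at or after the key's token."""
    tokens = [token for token, _ in ring]
    # Wrap around to the first virtual node when the token is past the last one.
    idx = bisect.bisect_left(tokens, token_for(key)) % len(ring)
    return ring[idx][1]
```

Calling `build_ring(["A", "B", "C"], 8)` yields 24 virtual nodes scattered around the ring, and `owner_of(ring, "alice")` resolves a key to whichever machine owns the virtual node covering its token.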
Looking back at our three-node data cluster, let's examine how data is distributed using
virtual nodes. For simplicity, we'll say there are twelve virtual nodes in the cluster (four
per physical node), although in a real cluster that number would be much higher. The
makers of Cassandra recommend 256 virtual nodes per physical node in a production cluster;
this is the num_tokens setting in cassandra.yaml.
Here's how our ring now looks, divided up into twelve virtual nodes:
Note that each physical machine is no longer responsible for a contiguous range of tokens;
instead, each machine is responsible for four virtual nodes, each of which covers a
different token range.
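As a rough illustration of the twelve-virtual-node layout, the sketch below splits a toy token space into twelve equal slices and deals them out at random, four to each machine. The machine names, the token space of 1200, and the fixed shuffle seed are assumptions chosen only to keep the output readable; real Cassandra token ranges are far larger and need not be equal in size.

```python
import random

NODES = ["node1", "node2", "node3"]  # hypothetical machine names
VNODE_COUNT = 12
TOKEN_SPACE = 1200  # toy token space, chosen so each slice is a round number


def assign_vnodes(nodes, vnode_count, seed=42):
    """Deal virtual nodes out so every machine gets an equal share, in random order."""
    owners = [nodes[i % len(nodes)] for i in range(vnode_count)]
    random.Random(seed).shuffle(owners)
    return owners


def token_ranges(vnode_count, token_space):
    """Split the token space into equal, contiguous slices, one per virtual node."""
    width = token_space // vnode_count
    return [(i * width, (i + 1) * width - 1) for i in range(vnode_count)]


owners = assign_vnodes(NODES, VNODE_COUNT)
ranges = token_ranges(VNODE_COUNT, TOKEN_SPACE)

# Print the token ranges each machine ends up responsible for.
for node in NODES:
    owned = [ranges[i] for i, owner in enumerate(owners) if owner == node]
    print(node, owned)
```

Each machine ends up with four slices, and because the assignment is shuffled, those slices are generally scattered around the ring rather than adjacent to one another.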
Virtual nodes facilitate redistribution
The main advantage of virtual nodes is their behavior when the cluster changes. Consider
the simple example of adding a fourth node to our three-node cluster. Without virtual
nodes, all three preexisting physical nodes have their token range changed to make space
for the fourth node. To accommodate the fourth machine, Cassandra must recalculate the