does not guarantee any ordering. This means that two lexicographically close row keys may well be assigned to two different nodes. This random assignment of a token to a key is what makes the Random partitioner suitable for distributing keys evenly among nodes, so a balanced cluster is highly unlikely to develop hotspots.
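As an illustration of why nearby keys end up far apart, here is a small sketch of how RandomPartitioner derives a token from the MD5 hash of a key. This mirrors the idea rather than reproducing Cassandra's exact implementation, and the sample keys are made up:

```python
import hashlib

def random_partitioner_token(key: bytes) -> int:
    """Sketch of a RandomPartitioner-style token: the MD5 digest of
    the row key, interpreted as a non-negative integer in roughly
    the range 0 .. 2**127."""
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, byteorder="big", signed=True))

# Two lexicographically adjacent keys land far apart in token space,
# which is what spreads rows evenly across the ring.
print(random_partitioner_token(b"user:1000"))
print(random_partitioner_token(b"user:1001"))
```

Because MD5 output bears no relation to the byte order of its input, consecutive keys scatter across the full token range instead of clustering on one node.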
The tokens generated by a Random partitioner range from 0 to 2^127 - 1. Therefore, for the i-th node in an N-node cluster, the initial token can be calculated as 2^127 * (i - 1) / N.
Note
Remember, initial_token applies only to non-vnode configurations. It is highly likely that you are using vnodes, that is, the num_tokens setting, in which case you can skip the calculation below and the corresponding per-node calculations for the other partitioners.
The following is simple Python code to generate the complete sequence of initial tokens for a Random partitioner in a cluster of eight nodes:
# running in a Python 3 shell
>>> nodes = 8
>>> print("\n".join(["Node #" + str(i + 1) + ": " +
str((2 ** 127) * i // nodes) for i in range(nodes)]))
Node #1: 0
Node #2: 21267647932558653966460912964485513216
Node #3: 42535295865117307932921825928971026432
Node #4: 63802943797675961899382738893456539648
Node #5: 85070591730234615865843651857942052864
Node #6: 106338239662793269832304564822427566080
Node #7: 127605887595351923798765477786913079296
Node #8: 148873535527910577765226390751398592512
The Byte-ordered partitioner
A Byte-ordered partitioner, as the name suggests, generates tokens that follow the order of the hexadecimal representations of the row keys. This makes it possible to order rows by row key and to iterate through rows as one would iterate through an ordered list.
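The ordering property can be sketched in a few lines of Python. This is an illustration of the principle, not Cassandra's actual implementation: treating the hex encoding of the key bytes as the token preserves the lexicographic order of the keys, which is what makes range scans over row keys possible.

```python
def byte_ordered_token(key: bytes) -> str:
    # The token is simply the hex representation of the raw key bytes;
    # hex encoding is monotonic, so token order matches key order.
    return key.hex()

keys = [b"carol", b"alice", b"bob"]

# Sorting by token yields the same order as sorting the keys themselves,
# so iterating over tokens walks the rows in row-key order.
assert sorted(keys, key=byte_ordered_token) == sorted(keys)
```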
However, this benefit comes with a major drawback: hotspots. Hotspots arise because the data is distributed unevenly across the cluster. If you have a cluster with