Deployment and Provisioning - Practical Cassandra


Byte Ordered

The ByteOrderedPartitioner was one of the first partitioners available in Cassandra. While it does have some advantages, its use is not recommended. The ByteOrderedPartitioner provides ordered partitioning of data by ordering rows lexically by their key bytes. Tokens are calculated using the hex representation of the leading character in a key.
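To make the ordering concrete, here is a minimal Python sketch of byte-ordered tokens. The helper name `byte_ordered_token` and the sample keys are hypothetical, and the sketch hex-encodes the whole key rather than reproducing Cassandra's internal token handling; it only illustrates that sorting by a hex token is the same as sorting by the raw key bytes.

```python
def byte_ordered_token(key: str) -> str:
    """Hypothetical helper: hex representation of the key's UTF-8 bytes."""
    return key.encode("utf-8").hex()

# Sample row keys (illustrative data, not from a real cluster).
keys = ["Lubow", "Bradberry", "Smith", "Adams"]

# Sorting by the hex token gives the same order as sorting by raw bytes,
# because each byte maps to two hex digits that compare the same way.
ordered = sorted(keys, key=byte_ordered_token)

for key in ordered:
    print(byte_ordered_token(key), key)
```

Because the token is derived directly from the key, rows with neighboring keys end up next to each other on the ring.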

The main advantage of an ordered partitioner is that you can scan by primary key, which means Cassandra can perform range scans. For example, you can say, “Show me all users in my database who have a last name between Bradberry and Lubow.” This type of query isn't possible with one of the random partitioners because the token is a hashed value of the key, so there is no guarantee of sequence. Even though this all seems like a great idea, you can use secondary indexes to achieve the same thing and avoid many of the consequences of using an ordered partitioner.
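The difference between the two layouts can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Cassandra's storage engine: the names `range_scan` and `md5_token` are hypothetical, the last names are sample data, and MD5 stands in for "some hash of the key".

```python
import bisect
import hashlib

# Sample row keys (illustrative data).
last_names = ["Adams", "Bradberry", "Carter", "Hewitt", "Lubow", "Smith"]

# Ordered layout: rows are stored sorted by their raw key bytes.
ordered_rows = sorted(last_names, key=lambda n: n.encode("utf-8"))

def range_scan(rows, start, end):
    """Return the contiguous slice of sorted rows with start <= key <= end."""
    lo = bisect.bisect_left(rows, start)
    hi = bisect.bisect_right(rows, end)
    return rows[lo:hi]

def md5_token(name: str) -> int:
    """Hypothetical hashed token: the MD5 digest read as an integer."""
    return int.from_bytes(hashlib.md5(name.encode("utf-8")).digest(), "big")

# Hashed layout: rows are stored sorted by token, so key order is lost.
hashed_rows = sorted(last_names, key=md5_token)

print(range_scan(ordered_rows, "Bradberry", "Lubow"))
print(hashed_rows)
```

In the ordered layout the answer is one contiguous slice. In the hashed layout the matching rows can sit anywhere in the token order, so the same query would have to examine every row.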

There are two major cons to using the ordered partitioner: poor load balancing and hot spots. Although it is entirely possible to get hot spots with the random partitioners as well, your chances increase with ordered partitioners. While this is application dependent, most applications tend to write sequentially or heavily favor certain types of data, such as timestamps or similar last names. If this is the case for your application, many of the reads and writes will go to the same few nodes and cause hot spots. These hot spots also cause trouble when you attempt to load-balance your data: one table being balanced across the cluster does not mean that another table will be, so you end up regularly recalculating and rebalancing your partition ranges. Not only is this an additional administrative headache, but it is also easily avoided by choosing a random partitioner, being intelligent about query patterns, and using secondary indexes.
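The hot-spot effect is easy to reproduce in a small simulation. The following Python sketch assumes a hypothetical four-node cluster, splits the key space evenly by leading byte for the ordered case and by MD5 value for the hashed case, and writes 1,000 sequential timestamp keys. The node count, the splitting rules, and the key format are all assumptions made for illustration.

```python
import hashlib
from collections import Counter
from datetime import datetime, timedelta

NODES = 4  # hypothetical cluster size

def ordered_node(key: str) -> int:
    # Ordered placement: split the byte space evenly by the key's first byte.
    return key.encode("utf-8")[0] * NODES // 256

def hashed_node(key: str) -> int:
    # Hashed placement: split the 128-bit MD5 space evenly across nodes.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") * NODES >> 128

# 1,000 sequential timestamp keys, one per second (illustrative workload).
start = datetime(2013, 1, 1)
keys = [(start + timedelta(seconds=i)).isoformat() for i in range(1000)]

ordered_load = Counter(ordered_node(k) for k in keys)
hashed_load = Counter(hashed_node(k) for k in keys)

print("ordered:", dict(ordered_load))
print("hashed: ", dict(hashed_load))
```

Under these assumptions every timestamp key lands on the same node in the ordered scheme, because all of the keys share the same leading character, while the hashed scheme should spread them roughly evenly across the four nodes.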

Random Partitioners

A random partitioner distributes data in a random but consistent fashion around the cluster. Unlike ordered partitioners such as the ByteOrderedPartitioner, random partitioners apply a hash function to the key, which makes the distribution of data appear more random. As of Cassandra 1.2, there are two random partitioners: RandomPartitioner and Murmur3Partitioner.
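“Random but consistent” means that the token looks unrelated to the key, yet the same key always produces the same token. A minimal Python sketch follows, assuming an MD5-based token in the spirit of the RandomPartitioner. The function name `random_token` is hypothetical, and the exact byte handling is an assumption rather than a bit-for-bit copy of Cassandra's implementation; Murmur3 is not shown because it is not in the Python standard library.

```python
import hashlib

def random_token(key: str) -> int:
    """Hypothetical MD5-based token: the digest read as a signed integer,
    then made non-negative (an assumption, not Cassandra's exact code)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return abs(int.from_bytes(digest, "big", signed=True))

# Nearly identical keys get unrelated tokens, but each key's token is stable.
for key in ["user1", "user2", "user3"]:
    print(key, random_token(key))
```

Running the loop twice prints the same tokens both times, which is what lets any node compute where a given key lives without coordination.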
