name within the row. In Cassandra, the columns in a row are sorted by this unique
column name. Also, because the number of possible partitions is very large (about
1.7 × 10^38 with the Random partitioner), Cassandra distributes the rows almost uniformly
across all the available machines by dividing them into equal token groups. Tables, or
column families, are contained within a logical container or namespace called a keyspace.
A keyspace is roughly analogous to a database in an RDBMS.
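The idea of equal token groups can be illustrated with a small sketch. It assumes one contiguous token range per machine and uses the Random partitioner's range (0 to 2^127 - 1); the function names equal_token_ranges and owner_of are hypothetical and are not part of any Cassandra API. A real cluster may assign tokens differently, for example by giving each node many smaller ranges.

```python
# Illustrative only: split a token ring into equal, contiguous ranges.
# Assumption: the Random partitioner's token range is 0 .. 2**127 - 1.
RANDOM_MIN, RANDOM_MAX = 0, 2**127 - 1


def equal_token_ranges(node_count, min_token=RANDOM_MIN, max_token=RANDOM_MAX):
    """Return node_count inclusive (low, high) ranges covering the whole ring."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    total = max_token - min_token + 1
    # Integer arithmetic keeps the boundaries exact even for very large rings.
    bounds = [min_token + (total * i) // node_count for i in range(node_count + 1)]
    return [(bounds[i], bounds[i + 1] - 1) for i in range(node_count)]


def owner_of(token, ranges):
    """Return the index of the range (machine) that holds the given token."""
    for index, (low, high) in enumerate(ranges):
        if low <= token <= high:
            return index
    raise ValueError("token is outside the ring")


ranges = equal_token_ranges(4)
for index, (low, high) in enumerate(ranges):
    print(f"machine {index}: {low} .. {high}")
```

Because every range has the same width, a hash function that spreads keys evenly over the token space also spreads rows evenly over the machines.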
Note
A word on max number of cells, rows, and partitions
A cell in a partition can be thought of as a key-value pair. The maximum number of cells
per partition is limited by the maximum value of a Java integer, which is about 2 billion.
So, one partition can hold a maximum of about 2 billion cells.
A row, in CQL terms, is a set of cells with predefined names. When you define a table
with a primary key that has just one column, the primary key also serves as the partition
key. But when you define a composite primary key, the first column in the definition of
the primary key works as the partition key. So, all the rows (sets of cells) that belong to
one partition key go into one partition. This means that every partition can have a
maximum of X rows, where X = (2 × 10^9 / number_of_columns_in_a_row). Essentially,
rows × columns cannot exceed 2 billion per partition.
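As a rough worked example of this limit, the sketch below divides the cell limit by the number of cells in each row. It assumes the limit is exactly the Java integer maximum (2^31 - 1) and that every row stores the same number of cells; the function name max_rows_per_partition is hypothetical.

```python
# Assumption: the per-partition cell limit equals Java's Integer.MAX_VALUE.
JAVA_INT_MAX = 2**31 - 1  # 2,147,483,647


def max_rows_per_partition(cells_per_row, max_cells=JAVA_INT_MAX):
    """Upper bound on rows in one partition when every row has cells_per_row cells."""
    if cells_per_row < 1:
        raise ValueError("cells_per_row must be at least 1")
    return max_cells // cells_per_row


for columns in (1, 10, 100):
    print(columns, "columns per row:", max_rows_per_partition(columns), "rows at most")
```

With 10 columns per row, for instance, the bound works out to a little over 214 million rows in a single partition.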
Finally, how many partitions can Cassandra hold for each table or column family? As we
know, column families are essentially distributed hashmaps. The tokens for the row keys,
or partition keys, are generated by taking a consistent hash of the key value that you
pass. So, the number of partitions is bounded by the number of distinct hashes these
functions can generate. This means that if you are using the default Murmur3 partitioner
(range -2^63 to +2^63 - 1), the maximum number of partitions that you can have is 2^64,
or about 1.84 × 10^19. If you use the Random partitioner, the number of partitions that
you can have is 2^127, or about 1.7 × 10^38.
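These bounds follow directly from the size of each token range, as the short calculation below shows. The Random partitioner range of 0 to 2^127 - 1 is an assumption chosen to be consistent with the 1.7 × 10^38 figure, and the helper name token_count is hypothetical.

```python
# Token ranges (inclusive). The Random partitioner range is an assumption
# consistent with the 1.7 x 10^38 figure quoted in the text.
MURMUR3_MIN, MURMUR3_MAX = -2**63, 2**63 - 1
RANDOM_MIN, RANDOM_MAX = 0, 2**127 - 1


def token_count(min_token, max_token):
    """Number of distinct tokens, and so the upper bound on partitions."""
    return max_token - min_token + 1


murmur3_partitions = token_count(MURMUR3_MIN, MURMUR3_MAX)
random_partitions = token_count(RANDOM_MIN, RANDOM_MAX)

print(f"Murmur3 partitioner: {murmur3_partitions:.3e} partitions at most")
print(f"Random partitioner:  {random_partitions:.3e} partitions at most")
```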