The Cassandra Data Model - Cassandra: The Definitive Guide


family, and so on. In this way, a column family is somewhat analogous to a table in the relational

world.

Putting this all together, we have the basic Cassandra data structures: the column, which is a

name/value pair (and a client-supplied timestamp of when it was last updated), and a column

family, which is a container for rows that have similar, but not identical, column sets.
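
As a rough illustration of these two structures, here is a minimal Python sketch. It is an in-memory model only, not Cassandra's API: the `Column` class, the `insert` helper, the integer timestamps, and the `instrument` column and `george` row are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Column:
    name: bytes      # column name
    value: bytes     # column value
    timestamp: int   # client-supplied time of the last update (illustrative units)


# A row maps column names to columns; a column family maps row keys to rows.
Row = Dict[bytes, Column]
ColumnFamily = Dict[bytes, Row]


def insert(cf: ColumnFamily, row_key: bytes, column: Column) -> None:
    """Store a column under the given row key, creating the row if needed."""
    cf.setdefault(row_key, {})[column.name] = column


musician: ColumnFamily = {}
insert(musician, b"bootsy", Column(b"email", b"bootsy@pfunk.com", 1))
insert(musician, b"bootsy", Column(b"instrument", b"bass", 1))
insert(musician, b"george", Column(b"email", b"george@pfunk.com", 1))
```

The two rows live in the same column family even though their column sets differ, which is the "similar, but not identical" property described above.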

In relational databases, we're used to storing column names as strings only; that's all we're

allowed. But in Cassandra, we don't have that limitation. Both row keys and column names can be

strings, like relational column names, but they can also be long integers, UUIDs, or any kind of

byte array. So there's some variety in how you can set your key names.
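
To make the "any kind of byte array" idea concrete, the following sketch turns strings, long integers, and UUIDs into byte-array keys. The `as_key` helper is hypothetical, and the encodings chosen here (UTF-8 for strings, 8-byte big-endian for longs, the 16 raw bytes of a UUID) are assumptions for illustration rather than a statement of what Cassandra does internally.

```python
import struct
import uuid


def as_key(value) -> bytes:
    """Encode a str, int, UUID, or raw bytes value as a byte-array key."""
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode("utf-8")
    if isinstance(value, uuid.UUID):
        return value.bytes                # 16 raw bytes
    if isinstance(value, int):
        return struct.pack(">q", value)   # 8-byte big-endian signed integer
    raise TypeError(f"unsupported key type: {type(value).__name__}")


# One row whose column names come from three different types.
row = {}
row[as_key("email")] = b"bootsy@pfunk.com"
row[as_key(42)] = b"stored under a long-integer name"
row[as_key(uuid.UUID(int=1))] = b"stored under a UUID name"
```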

This reveals another interesting quality of Cassandra's columns: they don't have to be as simple as

predefined name/value pairs; you can store useful data in the key itself, not only in the value. This

is somewhat common when creating indexes in Cassandra. But let's not get ahead of ourselves.
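
One way to picture data living in the column name is a hand-built index whose columns have names but empty values. The sketch below is plain Python, not Cassandra code, and the `musician_by_instrument` structure, the helper functions, and the sample musicians are all hypothetical.

```python
# Hypothetical index: row key = instrument, column names = musician row keys.
# The column values are left empty because the names carry the information.
musician_by_instrument = {}


def index_musician(instrument: str, musician_key: str) -> None:
    """Record that a musician plays an instrument by adding a valueless column."""
    musician_by_instrument.setdefault(instrument, {})[musician_key] = b""


def musicians_playing(instrument: str) -> list:
    """Return the column names (musician keys) of one index row, sorted."""
    return sorted(musician_by_instrument.get(instrument, {}))


index_musician("bass", "bootsy")
index_musician("bass", "larry")
index_musician("guitar", "eddie")
```

Looking up `musicians_playing("bass")` reads only the column names of the `bass` row; no column value is ever consulted.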

Now we don't need to store a value for every column every time we store a new entity. Maybe

we don't know the values for every column for a given entity. For example, some people have a

second phone number and some don't, and in an online form backed by Cassandra, there may be

some fields that are optional and some that are required. That's OK. Instead of storing null for

those values we don't know, which would waste space, we just won't store that column at all for

that row. So now we have a sparse, multidimensional array structure that looks like Figure 3-3.

Figure 3-3. A column family

It may help to think of it in terms of JavaScript Object Notation (JSON) instead of a picture:

Musician:                          ColumnFamily 1
    bootsy:                        RowKey
        email: bootsy@pfunk.com,   ColumnName:Value
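
The sparse layout can also be sketched as nested Python dictionaries, where a column with no known value is simply left out of the row instead of being stored as null. This is an illustrative model under stated assumptions: the `build_row` helper, the `phone2` column name, and the `george` row are made up for the example.

```python
from typing import Dict, Optional


def build_row(**fields: Optional[str]) -> Dict[str, str]:
    """Keep only the fields that have a value; unknown ones are not stored."""
    return {name: value for name, value in fields.items() if value is not None}


# Column family -> row key -> column name -> value
musician = {
    "bootsy": build_row(email="bootsy@pfunk.com", phone2=None),
    "george": build_row(email="george@pfunk.com", phone2="555-0100"),
}
```

Here the `bootsy` row has no `phone2` column at all, so nothing is stored for it, while the `george` row carries both columns.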
