family, and so on. In this way, a column family is somewhat analogous to a table in the relational
world.
Putting this all together, we have the basic Cassandra data structures: the column, which is a
name/value pair (and a client-supplied timestamp of when it was last updated), and a column
family, which is a container for rows that have similar, but not identical, column sets.
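As a rough illustration of these two structures, here is a minimal Python sketch. It is not Cassandra's implementation: the names `Column` and `insert` are hypothetical, the column family is modeled as a plain nested dictionary, and the rule that a write with a newer timestamp replaces an older one is simplified (ties are simply allowed to overwrite).

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class Column:
    """A name/value pair plus a client-supplied timestamp (hypothetical model)."""
    name: Any
    value: Any
    timestamp: int  # supplied by the client, e.g. microseconds since the epoch


# A column family modeled as {row_key: {column_name: Column}}.
ColumnFamily = Dict[Any, Dict[Any, Column]]


def insert(cf: ColumnFamily, row_key: Any, name: Any, value: Any, timestamp: int) -> None:
    """Store a column in a row, keeping whichever write has the newer timestamp."""
    row = cf.setdefault(row_key, {})
    existing = row.get(name)
    # Simplification: an equal timestamp overwrites; real tie-breaking is not modeled.
    if existing is None or timestamp >= existing.timestamp:
        row[name] = Column(name, value, timestamp)
```

With this model, `insert(musician, "bootsy", "email", "bootsy@pfunk.com", 2)` followed by an insert of the same column with timestamp 1 leaves the first value in place.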
In relational databases, we're used to storing column names as strings only; that's all we're
allowed. But in Cassandra, we don't have that limitation. Both row keys and column names can be
strings, like relational column names, but they can also be long integers, UUIDs, or any kind of
byte array. So there's some variety to how your key names can be set.
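One way to picture this variety is that each kind of key can be reduced to a byte array. The sketch below is an illustration only: `to_key_bytes` is a hypothetical helper, and the specific encodings (UTF-8 for strings, 8-byte big-endian for long integers, the 16 raw bytes of a UUID) are assumptions chosen for the example, not a description of Cassandra's own serialization code.

```python
import struct
import uuid


def to_key_bytes(key):
    """Encode a string, long integer, UUID, or byte array as bytes (illustrative)."""
    if isinstance(key, bytes):
        return key                      # already a byte array
    if isinstance(key, str):
        return key.encode("utf-8")      # assumed text encoding
    if isinstance(key, uuid.UUID):
        return key.bytes                # 16 raw bytes
    if isinstance(key, int):
        return struct.pack(">q", key)   # 8-byte big-endian signed long
    raise TypeError(f"unsupported key type: {type(key).__name__}")
```

A row key such as `"bootsy"` and a column name such as `42` then both end up as byte arrays that can serve as dictionary keys in a model of a column family.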
This reveals another interesting quality of Cassandra's columns: they don't have to be as simple as
predefined name/value pairs; you can store useful data in the column name (the key of the pair)
itself, not only in the value. This is somewhat common when creating indexes in Cassandra. But
let's not get ahead of ourselves.
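To make the idea of data living in the column name concrete, here is a small sketch of a hand-built index, assuming a layout in which the row key is the indexed value and each column name is the key of a matching row. The functions `index_musician` and `musicians_playing` and the instrument data are hypothetical and only for illustration.

```python
# Hypothetical index layout: {instrument: {musician_row_key: empty value}}.
# The column name carries the data; the value is left empty.

def index_musician(index, instrument, musician_key):
    """Record that the musician row identified by musician_key plays instrument."""
    index.setdefault(instrument, {})[musician_key] = b""


def musicians_playing(index, instrument):
    """Return the musician row keys stored as column names, in sorted order."""
    return sorted(index.get(instrument, {}))


instrument_index = {}
index_musician(instrument_index, "bass", "bootsy")      # sample data, not from the text
index_musician(instrument_index, "guitar", "george")    # sample data, not from the text
```

Looking up `musicians_playing(instrument_index, "bass")` reads only column names; no value is ever consulted.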
Now we don't need to store a value for every column every time we store a new entity. Maybe
we don't know the values for every column for a given entity. For example, some people have a
second phone number and some don't, and in an online form backed by Cassandra, there may be
some fields that are optional and some that are required. That's OK. Instead of storing null for
those values we don't know, which would waste space, we just won't store that column at all for
that row. So now we have a sparse, multidimensional array structure that looks like Figure 3-3.
Figure 3-3. A column family
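The sparse layout can be sketched in a few lines, assuming a column family modeled as a nested dictionary. The helper `store_entity` and the form fields are hypothetical; the point is only that an unknown value results in no column at all rather than a stored null.

```python
def store_entity(column_family, row_key, fields):
    """Store only the fields whose values are known; unknown fields get no column."""
    column_family[row_key] = {
        name: value for name, value in fields.items() if value is not None
    }


people = {}
# Sample form submissions (illustrative data): the second phone number is optional.
store_entity(people, "user1", {"phone": "555-0100", "phone2": "555-0101"})
store_entity(people, "user2", {"phone": "555-0102", "phone2": None})
```

After these calls the two rows have different column sets: `people["user2"]` has no `phone2` entry, which is different from having one whose value is null.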
It may help to think of it in terms of JavaScript Object Notation (JSON) instead of a picture:
Musician:                        ColumnFamily 1
    bootsy:                      RowKey
        email: bootsy@pfunk.com, ColumnName:Value