The Cassandra Data Model - Cassandra: The Definitive Guide

in the Cassandra world. When you want to query for users in a city, you just query the UserCity column family, instead of querying the User column family and doing a bunch of pruning work on the client across a potentially large data set.

Note that in this context, “materialized” means storing a full copy of the original data so that everything you need to answer a query is right there, without forcing you to look up the original data. If you instead have to perform a second query because the second column family stores only column names that you use like foreign keys, that's a secondary index.

NOTE

As of 0.7, Cassandra has native support for secondary indexes.
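To make the difference concrete, here is a minimal sketch that models column families as plain Python dictionaries. It is not a Cassandra client API; the row keys, column names, and helper functions are all hypothetical, chosen only to contrast client-side pruning with reading a materialized view.

```python
# In-memory stand-ins for column families: {row_key: {column_name: value}}.
# All names and data are hypothetical; this is not a Cassandra client API.
user = {
    "alice": {"name": "Alice Example", "city": "Boise"},
    "bob": {"name": "Bob Example", "city": "Boise"},
    "carol": {"name": "Carol Example", "city": "Austin"},
}


def users_in_city_by_pruning(user_cf, city):
    """Scan every User row and discard the non-matches on the client."""
    return {key: row for key, row in user_cf.items() if row["city"] == city}


def build_user_city_view(user_cf):
    """Build a materialized view: one row per city, holding a full copy
    of each user's data so no second lookup is needed."""
    view = {}
    for key, row in user_cf.items():
        view.setdefault(row["city"], {})[key] = dict(row)
    return view


user_city = build_user_city_view(user)

# A single row read answers the query; no scan of User is required.
boise_users = user_city["Boise"]
```

Both approaches return the same users here, but the pruning version touches every row in User, while the view reads one UserCity row. The price is that the copy has to be kept up to date whenever User changes.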

Valueless Column

Let's build on our User / UserCity example. Because we're storing the reference data in the User column family, two things arise: one, you need to have unique and thoughtful keys that can enforce referential integrity; and two, the columns in the UserCity column family don't necessarily need values. If you have a row key of Boise, then the column names can be the names of the users in that city. Because your reference data is in the User column family, the columns don't really have any meaningful value; you're just using the row as a prefabricated list, but you'll likely want to use the column names in that list to get additional data from the reference column family.
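A rough sketch of the pattern, again using dictionaries as stand-ins for column families rather than a real Cassandra client. The data, the `users_in` helper, and the choice to use User row keys as the column names are assumptions made for illustration.

```python
# Reference data: the User column family (hypothetical rows).
user = {
    "alice": {"name": "Alice Example", "email": "alice@example.com"},
    "bob": {"name": "Bob Example", "email": "bob@example.com"},
}

# UserCity: the row key is the city, the column names are User row keys,
# and the column values are intentionally empty.
user_city = {
    "Boise": {"alice": b"", "bob": b""},
}


def users_in(city):
    """Read the city's column names, then fetch each user's details
    from the reference column family (the second lookup)."""
    column_names = user_city.get(city, {})
    return [user[key] for key in sorted(column_names) if key in user]


boise = users_in("Boise")
```

The `if key in user` check stands in for the referential integrity concern: nothing stops UserCity from naming a user that no longer exists, so the reader has to cope with dangling entries.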

Aggregate Key

When you use the Valueless Column pattern, you may also need to employ the Aggregate Key pattern. This pattern fuses together two scalar values with a separator to create an aggregate. To extend our example further, city names typically aren't unique; many states in the US have a city called Springfield, and there's a Paris, Texas, and a Paris, Tennessee. So what will work better here is to fuse together the state name and the city name to create an Aggregate Key to use in our Materialized View. This key would look something like: TX:Paris or TN:Paris. By convention, many Cassandra users employ the colon as the separator, but it could be a pipe character or any other character that is not otherwise meaningful in your keys.
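Building and splitting such keys might look like the following sketch. The function names and the check that rejects separators inside the parts are illustrative assumptions, not part of any Cassandra API.

```python
SEPARATOR = ":"  # conventional choice; any character not used in your keys works


def make_aggregate_key(state, city, sep=SEPARATOR):
    """Fuse two scalar values into one row key, such as 'TX:Paris'."""
    if sep in state or sep in city:
        # The separator must not be meaningful inside either part.
        raise ValueError(f"separator {sep!r} appears in a key part")
    return f"{state}{sep}{city}"


def split_aggregate_key(key, sep=SEPARATOR):
    """Recover the (state, city) pair from an aggregate key."""
    state, city = key.split(sep, 1)
    return state, city


texas_key = make_aggregate_key("TX", "Paris")
tennessee_key = make_aggregate_key("TN", "Paris")
```

With these keys, the two cities named Paris land in separate UserCity rows instead of colliding under one row key.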

Some Things to Keep in Mind

Let's look briefly at a few things to keep in mind when you're trying to move from a relational mindset to Cassandra's data model. I'll just say it: if you have been working with relational databases for a long time, it's not always easy. Here are a few pointers:
