in the Cassandra world. When you want to query for users in a city, you just query the UserCity
column family, instead of querying the User column family and doing a bunch of pruning work
on the client across a potentially large data set.
Note that in this context, “materialized” means storing a full copy of the original data so that
everything you need to answer a query is right there, without forcing you to look up the original
data. If you have to perform a second query because the second column family stores only column
names that act like foreign keys, that's a secondary index.
NOTE
As of version 0.7, Cassandra has native support for secondary indexes.
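The difference between pruning on the client and reading from a materialized view can be sketched with plain Python dictionaries standing in for column families. This is only a model of the data layout, not a Cassandra client API; the sample users and the function names `users_in_city_by_scan` and `build_user_city_view` are made up for illustration.

```python
# In-memory stand-ins for column families: {row_key: {column_name: value}}.
# Hypothetical sample data; this models the layout only, not a real client.
user = {
    "alice": {"city": "Boise", "email": "alice@example.com"},
    "bob": {"city": "Boise", "email": "bob@example.com"},
    "carol": {"city": "Austin", "email": "carol@example.com"},
}


def users_in_city_by_scan(user_cf, city):
    """Without a view: read every User row and prune on the client."""
    return {key: cols for key, cols in user_cf.items() if cols["city"] == city}


def build_user_city_view(user_cf):
    """Materialized view: one row per city holding a full copy of each user."""
    view = {}
    for user_key, cols in user_cf.items():
        view.setdefault(cols["city"], {})[user_key] = dict(cols)
    return view


user_city = build_user_city_view(user)

# With the view, a single row lookup answers "which users are in Boise?"
boise_users = user_city["Boise"]
```

The cost of this layout is that every write to User must also be written to UserCity, which is the usual trade-off of denormalizing for reads.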
Valueless Column
Let's build on our User / UserCity example. Because we're storing the reference data in the User
column family, two things arise: one, you need to have unique and thoughtful keys that can
enforce referential integrity; and two, the columns in the UserCity column family don't necessarily
need values. If you have a row key of Boise, then the column names can be the names of the
users in that city. Because your reference data is in the User column family, the columns don't
really have any meaningful value; you're just using them as a prefabricated list, but you'll likely
want to use the names in that list to get additional data from the reference column family.
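As a rough sketch of the Valueless Column pattern, the dictionaries below again stand in for column families. The row key is the city, the column names are user names, and the column values are left empty. The sample data and the helper `users_in` are assumptions made for illustration, not part of any Cassandra API.

```python
# Reference data lives in the User column family (hypothetical sample rows).
user = {
    "alice": {"email": "alice@example.com"},
    "bob": {"email": "bob@example.com"},
    "carol": {"email": "carol@example.com"},
}

# UserCity: row key is the city, column names are user names, values are empty.
user_city = {
    "Boise": {"alice": "", "bob": ""},
    "Austin": {"carol": ""},
}


def users_in(city):
    """Read the prefabricated list of names, then look each one up in User."""
    names = user_city.get(city, {})
    return {name: user[name] for name in names}


boise = users_in("Boise")
```

Note that `users_in` fails with a `KeyError` if a name in UserCity has no matching row in User, which is why the keys need to be unique and kept consistent by the application.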
Aggregate Key
When you use the Valueless Column pattern, you may also need to employ the Aggregate Key
pattern. This pattern fuses together two scalar values with a separator to create an aggregate. To
extend our example further, city names typically aren't unique; many states in the US have a city
called Springfield, and there's a Paris, Texas, and a Paris, Tennessee. So what will work better
here is to fuse together the state name and the city name to create an Aggregate Key to use in
our Materialized View. This key would look something like TX:Paris or TN:Paris. By
convention, many Cassandra users employ the colon as the separator, but it could be a pipe character
or any other character that is not otherwise meaningful in your keys.
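A minimal sketch of building and splitting such keys follows. The helper names `make_key` and `split_key` are hypothetical, and the check that rejects parts containing the separator is one possible way to honor the requirement that the separator not be otherwise meaningful in your keys.

```python
SEPARATOR = ":"  # conventional choice; any character unused in the parts works


def make_key(state, city):
    """Fuse state and city into one aggregate row key, e.g. 'TX:Paris'."""
    for part in (state, city):
        if SEPARATOR in part:
            raise ValueError(f"{part!r} contains the separator {SEPARATOR!r}")
    return f"{state}{SEPARATOR}{city}"


def split_key(key):
    """Recover (state, city) from an aggregate key."""
    state, city = key.split(SEPARATOR, 1)
    return state, city


# The two cities named Paris now map to distinct row keys.
user_city = {
    make_key("TX", "Paris"): {"alice": ""},
    make_key("TN", "Paris"): {"bob": ""},
}
```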
Some Things to Keep in Mind
Let's look briefly at a few things to keep in mind when you're trying to move from a relational
mindset to Cassandra's data model. I'll just say it: if you have been working with relational
databases for a long time, it's not always easy. Here are a few pointers: