Establishing Relationships - Learning Apache Cassandra

Using secondary indexes to avoid denormalization

So far, we've exclusively used primary key columns to look up rows: either the full primary key when we're looking for a specific row, or just the partition key when retrieving multiple rows in a single partition. We know that these kinds of lookups are very efficient, because Cassandra can satisfy the query by accessing the single region of storage that holds the partition's data in order.

This is the motivation for the denormalized follow structure we've built in this chapter: whether we want to answer the question, "Who does alice follow?", or the question, "Who follows alice?", we can construct a query that only needs to access a single partition. However, we're accepting additional complexity in the form of storing two versions of the same information, in user_inbound_follows and user_outbound_follows.

As it happens, Cassandra does provide us with a way to answer both questions in a reasonably efficient way using a single table, with a single representation of each follow relationship. By adding a secondary index, we can enable lookup of rows using columns other than the primary key.
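
To make the idea concrete, here is a minimal in-memory sketch of a single follow table with one extra index. It is a toy model, not Cassandra's storage engine or its actual index implementation. The class name FollowTable, the column names followed_username and follower_username, and the users bob, carol, and dave are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


class FollowTable:
    """Toy model of ONE follow table (hypothetical; not Cassandra internals).

    Assumed layout: followed_username is the partition key and
    follower_username is the remaining primary key column.
    """

    def __init__(self):
        # Partition key (followed_username) -> followers stored in that partition.
        self.partitions = defaultdict(set)
        # Stand-in for a secondary index on follower_username:
        # follower_username -> followed_username values of matching rows.
        self.follower_index = defaultdict(set)

    def follow(self, follower_username, followed_username):
        # Each relationship is written once; the index is maintained alongside it.
        self.partitions[followed_username].add(follower_username)
        self.follower_index[follower_username].add(followed_username)

    def followers_of(self, username):
        # "Who follows <username>?" is a partition-key lookup.
        return sorted(self.partitions.get(username, ()))

    def followed_by(self, username):
        # "Who does <username> follow?" is answered through the index,
        # because follower_username is not the partition key.
        return sorted(self.follower_index.get(username, ()))


table = FollowTable()
table.follow("alice", "bob")
table.follow("alice", "carol")
table.follow("dave", "alice")

print(table.followers_of("alice"))  # ['dave']
print(table.followed_by("alice"))   # ['bob', 'carol']
```

In this sketch the application writes each follow relationship once, and both questions are answered from the same table; the index plays the role that the second denormalized table played before.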
