Collections, Tuples, and User-defined Types - Learning Apache Cassandra

Database Reference

In-Depth Information

Collections and secondary indexes

Let's say we'd like to get a list of all of the status updates that alice has starred; we can

display this as part of her user profile. In order to do this, we'd like to be able to look up all

the rows in the user_status_updates table where the starred_by_users

column contains the value alice . This is similar to the use case for a secondary index that

we explored in the Using secondary indexes to avoid denormalization section of Chapter 5 ,

Establishing Relationships , except that, in this case, we'd like to be able to perform a look-

up based on a single value within a collection column.

Happily, it is entirely valid to put a secondary index on a collection column. The syntax for

this is identical to putting an index on any other column:

CREATE INDEX ON "user_status_updates" ("starred_by_users");

So far, so familiar. Now, we'll introduce the CONTAINS operator, which can be used to

look up rows by a value in a collection column:

SELECT * FROM "user_status_updates"

WHERE "starred_by_users" CONTAINS 'alice';

The WHERE…CONTAINS clause will restrict results to those rows whose

starred_by_users set contains the exact value alice . In this case, we'll receive a

single result:

Creating secondary indexes on collections is a powerful tool for relating data—now we can

model many-to-many relationships using a single, normalized data structure. In this case,

the many-to-many relationship is between users and status updates: a user can star many

status updates, and a status update can be starred by many users.

It's worth re-emphasizing the warning in the section Limitations of secondary indexes of

Chapter 5 , Establishing Relationships . In particular, lookup by a secondary index on a col-

Search WWH ::

Custom Search

Home