Beyond Key-Value Lookup - Learning Apache Cassandra

The limits of the WHERE keyword

At this point, we've seen that you can look up rows by partition key alone, or by a combination of a partition key and a clustering column. We can easily imagine other ways to use WHERE, but it's not as flexible as we might hope.

Restricting by clustering column

In Chapter 3, Organizing Related Data, you learned that any row in a table is uniquely identified by the combined values of its primary key columns. However, in the case of user_status_updates, the username column is superfluous for the purposes of uniqueness: since id is a UUID, we know that it alone uniquely identifies the row.
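
For reference, the discussion here assumes a table shaped roughly like the one below. This is only a sketch reconstructed from the description above (username as the partition key, id as the clustering column, and a body column for the text of the update); the exact column types are assumptions, not something this section states.

```cql
-- Assumed shape of the table; the column types are illustrative only.
CREATE TABLE "user_status_updates" (
  "username" text,      -- partition key
  "id" timeuuid,        -- clustering column (a UUID, per the text)
  "body" text,          -- text of the status update
  PRIMARY KEY ("username", "id")
);
```

The first column named in PRIMARY KEY is the partition key, and the remaining one is the clustering column, which is why the order of the two names matters.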

So, can we skip the username partition key and just look up rows by the id clustering

column? Let's give it a shot:

SELECT * FROM "user_status_updates"
WHERE id = 3f9b5f00-e8f7-11e3-9211-5f98e903bf02;

This query is syntactically valid CQL, and the WHERE clause does identify an existing row in the table: specifically, the status update whose body reads Alice Update 1. It is not, however, a legal query. Instead of returning the row, Cassandra responds with an error message.

Recall our mental model for tables with compound primary keys: Cassandra organizes status updates like a key-value store, in which the partition key acts as the lookup key. Without the partition key, Cassandra can't efficiently get to the rows you've specified. Instead, it would need to iterate internally over every row in the table, looking for rows that meet the conditions in the query. This sort of full table scan is extremely expensive for any table of non-trivial size, and Cassandra won't let you perform one unless you append the ALLOW FILTERING directive to the end of the query, indicating that you know what you're getting yourself into.
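
To make the contrast concrete, here is a minimal sketch of the two forms of the lookup. The username value 'alice' is an assumption inferred from the body text Alice Update 1, and whether the second statement is accepted may depend on the Cassandra version in use, so treat this as an illustration rather than a guaranteed result.

```cql
-- Efficient form: the partition key leads Cassandra straight to one partition,
-- and the clustering column picks the row within it.
-- 'alice' is an assumed username, not confirmed by this section.
SELECT * FROM "user_status_updates"
WHERE username = 'alice'
  AND id = 3f9b5f00-e8f7-11e3-9211-5f98e903bf02;

-- Explicit opt-in to a scan: no partition key, so every row may be examined.
-- Support for filtering on a clustering column alone can vary by version.
SELECT * FROM "user_status_updates"
WHERE id = 3f9b5f00-e8f7-11e3-9211-5f98e903bf02
ALLOW FILTERING;
```

The first form is the one to prefer whenever the partition key is known; the second trades predictable performance for convenience and is best kept to small tables or ad hoc investigation.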
