Chapter 3. Effective CQL
This chapter introduces the underlying data structure of tables in Cassandra.
Let's set some development rules of thumb before we dive into CQL. With CQL 3, the
Cassandra development team has done a commendable job of almost entirely eliminating
any chance of using an antipattern, while at the same time providing an interface that
is friendly to people who know SQL.
If you are a developer, this is probably the most important chapter for you. You will get a
sense of what is and is not possible when working with Cassandra. You may also want to
refer to Chapter 8, Integration with Hadoop, to understand how to use Cassandra with
various big data technologies such as the Hadoop ecosystem and Spark/Shark.
When dealing with Cassandra, keep the following things in mind:
Denormalize, denormalize, and denormalize: Forget about old-school 3NF in
Cassandra; the fewer the network trips, the better the performance. Denormalize
wherever you can for quicker retrieval, and let the application logic handle the
responsibility of reliably updating all the redundant copies.
Rows are gigantic and sorted: The giga-sized rows (a partition can hold up to 2
billion cells) can be used to store sortable and sliceable columns.
Need to sort comments by timestamp? Need to sort bids by quoted price? Put in a
column with the appropriate comparator (you can always write your own comparator).
One row, one machine: Each row lives entirely on one machine (and on each of its
replicas); rows are not sharded across nodes. Beware: a high-demand row may create
a hotspot.
From query to model: In an RDBMS, you model most of the tables after the entities
in the application and then run analytical queries to get data out of them. Cassandra
has no such provision. So you may need to denormalize your model in such a way that
all your queries stay limited to a handful of simple commands such as get, slice,
count, multi_get, and some simple indexed searches.
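The rules of thumb above can be sketched with a small in-memory model. This is only an illustration written in Python, not Cassandra code: the ToyCluster class, its method names, the node names, and the keys are all hypothetical, and replication is deliberately left out. It assumes that a row is placed on a node by hashing its key, and that the cells inside a row are kept sorted so that a range can be sliced out cheaply.

```python
import bisect
import hashlib
from collections import defaultdict


class ToyCluster:
    """Hypothetical toy model: each row lives on one node, cells are kept sorted."""

    def __init__(self, node_names):
        self.node_names = list(node_names)
        # node name -> row key -> sorted list of (column_key, value) cells
        self.nodes = {name: defaultdict(list) for name in self.node_names}

    def node_for(self, row_key):
        # Hash the row key to pick exactly one owning node (no replication here).
        digest = hashlib.md5(row_key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.node_names)
        return self.node_names[index]

    def put(self, row_key, column_key, value):
        cells = self.nodes[self.node_for(row_key)][row_key]
        keys = [k for k, _ in cells]
        i = bisect.bisect_left(keys, column_key)
        if i < len(cells) and cells[i][0] == column_key:
            cells[i] = (column_key, value)  # same column key: overwrite
        else:
            cells.insert(i, (column_key, value))  # keep the row sorted

    def slice(self, row_key, start, end):
        # Return the cells whose column key lies in [start, end], in sorted order.
        cells = self.nodes[self.node_for(row_key)].get(row_key, [])
        keys = [k for k, _ in cells]
        return cells[bisect.bisect_left(keys, start):bisect.bisect_right(keys, end)]

    def count(self, row_key):
        return len(self.nodes[self.node_for(row_key)].get(row_key, []))


cluster = ToyCluster(["node-a", "node-b", "node-c"])

# Denormalize: the same comment is written once per query we want to serve,
# "comments on a post" and "comments by a user", keyed by timestamp.
for ts, user, text in [(1003, "bob", "third"), (1001, "alice", "first"), (1002, "alice", "second")]:
    cluster.put("post:42", ts, (user, text))
    cluster.put("user:" + user, ts, ("post:42", text))

# Comments on post 42 between timestamps 1001 and 1002, already sorted.
recent = cluster.slice("post:42", 1001, 1002)
```

Each query here reads a single row on a single node, which is the point of modelling from the query backwards. The cost is visible too: every comment is written twice, and a very popular key such as post:42 would send all of its traffic to whichever node node_for picks.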