How Cassandra works
Diving into the various components of Cassandra without any context is a frustrating
experience. There is little point in studying SSTables, MemTables, and log-structured
merge (LSM) trees without seeing how they fit into the functionality and performance
guarantees that Cassandra gives. So we first look at Cassandra's write and read
mechanisms. Some of the terms that we encounter during this discussion may not be
immediately understandable. They are explained in detail later in this chapter.
A rough overview of the Cassandra components is shown in the following figure:
Main components of the Cassandra service
The main class of the storage layer is StorageProxy. It handles all the requests. The
messaging layer is responsible for inter-node communication, such as gossip. Apart from
this, process-level structures keep a rough idea of the actual data containers and where
they live.
There are four data buckets that you need to know. The MemTable is a hash table-like
structure that stays in memory. It contains the actual cell data. The SSTable is the disk
version of the MemTable. When MemTables are full, they are persisted to hard disk as
SSTables. The commit log is an append-only log of all the mutations that are sent to the
Cassandra cluster.
Note
Mutations can be thought of as update commands. So, insert, update, and delete
operations are mutations, since they mutate the data.
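The relationship between the commit log, the MemTable, and the SSTable can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not Cassandra's actual implementation: the class name `ToyStore`, the `memtable_limit` threshold, the JSON file formats, and the use of `None` as a delete marker are all hypothetical choices made for illustration.

```python
import json
import os
import tempfile


class ToyStore:
    """Toy model of a write path: commit log -> MemTable -> SSTable.

    Hypothetical illustration only; names and file formats are assumptions.
    """

    def __init__(self, data_dir, memtable_limit=2):
        self.data_dir = data_dir
        self.memtable_limit = memtable_limit  # assumed "MemTable is full" threshold
        self.memtable = {}                    # in-memory, hash table-like structure
        self.sstables = []                    # paths of files flushed by this instance
        self.commit_log_path = os.path.join(data_dir, "commit.log")
        self._replay_commit_log()

    def _replay_commit_log(self):
        # Rebuild the MemTable from mutations that were logged but never flushed.
        if not os.path.exists(self.commit_log_path):
            return
        with open(self.commit_log_path) as log:
            for line in log:
                key, value = json.loads(line)
                self.memtable[key] = value

    def write(self, key, value):
        # 1. Append the mutation to the on-disk commit log.
        with open(self.commit_log_path, "a") as log:
            log.write(json.dumps([key, value]) + "\n")
        # 2. Apply the mutation to the in-memory MemTable.
        self.memtable[key] = value
        # 3. Persist the MemTable as an SSTable once it is "full".
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key):
        # A delete is also a mutation; here it is recorded as a None marker.
        self.write(key, None)

    def flush(self):
        path = os.path.join(self.data_dir, f"sstable-{len(self.sstables)}.json")
        with open(path, "w") as sstable:
            # Store the entries sorted by key.
            json.dump(sorted(self.memtable.items()), sstable)
        self.sstables.append(path)
        self.memtable = {}
        # The flushed mutations no longer need to be replayed.
        open(self.commit_log_path, "w").close()


if __name__ == "__main__":
    store = ToyStore(tempfile.mkdtemp(), memtable_limit=3)
    store.write("user:1", "alice")
    store.write("user:2", "bob")
    print(store.memtable)   # two mutations held in memory and in the commit log
    store.delete("user:1")  # third mutation fills the MemTable and triggers a flush
    print(store.sstables)   # one SSTable file on disk
```

The sketch appends to the log before touching memory, so a restarted `ToyStore` pointed at the same directory can rebuild its MemTable from the log. Real systems add much more on top of this, such as merging SSTables and serving reads across them.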
The commit log lives on disk and helps to replay uncommitted changes. These three are
basically the core data. Then there are bloom filters and the index. The bloom filter is
a probabilistic data structure. They both live in memory and contain information