Architecture - Practical Cassandra

structure in the cloud or across data centers, you will want to give yourself a little

room for failure and up the setting to 10 or 11.

CommitLogs and MemTables

All write operations in Cassandra first go through the CommitLog. It is mainly
because of the CommitLog that Cassandra can attain such high write performance.
The CommitLog is so integral to a Cassandra mutation operation that an operation
is not considered successful unless it has been written to the CommitLog. A
mutation is any INSERT, UPDATE, or DELETE operation. The reason Cassandra is so
fast at receiving writes is that all operations are appended to the CommitLog
sequentially. Sequential writes avoid disk seeks, and therefore the entire
operation is much faster.
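
The append-only behavior described above can be illustrated with a minimal sketch. This is not Cassandra's actual CommitLog format or API: the CommitLog class, the replay function, the file name, and the one-JSON-document-per-line encoding are all assumptions made for illustration. The sketch only shows that every mutation is appended to the end of a file and acknowledged after it has reached disk.

```python
import json
import os
import tempfile


class CommitLog:
    """Toy append-only log (hypothetical; not Cassandra's on-disk format)."""

    def __init__(self, path):
        # Mode "ab" only ever appends, so every write lands at the end of the file.
        self._file = open(path, "ab")

    def append(self, mutation):
        # One JSON document per line stands in for a serialized mutation.
        record = json.dumps(mutation, sort_keys=True).encode("utf-8") + b"\n"
        self._file.write(record)
        self._file.flush()
        os.fsync(self._file.fileno())  # push the record to disk before acknowledging
        return True  # acknowledged only once the record is durable

    def close(self):
        self._file.close()


def replay(path):
    """Read mutations back in the order in which they were appended."""
    with open(path, "rb") as f:
        return [json.loads(line) for line in f]


def demo():
    path = os.path.join(tempfile.mkdtemp(), "commitlog.bin")
    log = CommitLog(path)
    log.append({"op": "INSERT", "key": "user:1", "value": "alice"})
    log.append({"op": "DELETE", "key": "user:1"})
    log.close()
    return replay(path)
```

Calling demo() returns the two mutations in the order they were written, which is the property that makes a log like this usable for recovery.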

The order for a mutation operation is as follows. First, the operation comes in
over the wire (possibly via CQL, Thrift, or any other means by which you
communicate with Cassandra) and is written to the CommitLog. Once the operation
has been written to disk and has satisfied the data durability requirements (in
other words, this information is now recoverable), it is written to a MemTable.
A MemTable is an in-memory key/value data structure similar to a cache. Each
ColumnFamily has a separate MemTable. A MemTable is flushed to disk when the
number of keys it holds exceeds a predefined limit (128 keys is the default) or
when the size of the space allocated for MemTables is exceeded.
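
The ordering described above (CommitLog first, then MemTable, then a flush once the key limit is exceeded) can be sketched as follows. The MemTable and ColumnFamilyStore classes here are hypothetical stand-ins, not Cassandra classes: the commit log is a plain Python list rather than a durable file, and only the key-count threshold is modeled, not the size-based one.

```python
class MemTable:
    """Toy in-memory key/value structure for a single ColumnFamily."""

    def __init__(self, max_keys):
        self.max_keys = max_keys
        self.rows = {}

    def apply(self, key, columns):
        # Merge the new columns into whatever the row already holds.
        self.rows.setdefault(key, {}).update(columns)

    def is_full(self):
        # "Exceeds" the limit, so strictly greater than max_keys.
        return len(self.rows) > self.max_keys


class ColumnFamilyStore:
    """Hypothetical write path for one ColumnFamily."""

    def __init__(self, max_keys=128):  # 128 is the default quoted in the text
        self.max_keys = max_keys
        self.commit_log = []  # stand-in for the durable on-disk log
        self.memtable = MemTable(max_keys)
        self.flushed = []     # each entry stands in for one flushed table

    def write(self, key, columns):
        # Step 1: the mutation is recorded in the commit log first.
        self.commit_log.append((key, dict(columns)))
        # Step 2: only then is it applied to the in-memory table.
        self.memtable.apply(key, columns)
        # Step 3: flush once the number of keys exceeds the limit.
        if self.memtable.is_full():
            self.flush()

    def flush(self):
        # Rows are written out sorted by row key; a fresh MemTable takes over.
        self.flushed.append(dict(sorted(self.memtable.rows.items())))
        self.memtable = MemTable(self.max_keys)
```

With max_keys=2, for example, the third distinct key pushes the MemTable over its limit, so the store flushes all three rows and starts again with an empty MemTable.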

SSTables

An SSTable is the way that Cassandra stores data on disk. Each SSTable is made
up of five files: a bloom filter file, an index file, a compression file
(optional, present only if the ColumnFamily data is compressed), a statistics
file, and a data file. When a MemTable is flushed to disk, the following steps
take place. First, the index needs to be written. In order to write the index,
the columns are sorted by their row keys. Then the columns are iterated over and
the bloom filter is created. Indexing is done based on the ColumnFamily
comparator. Then the data is serialized and written to disk. The data file is
written based on the partitioner, hashing algorithm, and compression options. If
the data file is written as compressed, the CompressionInfo file is also
written. After the other files have been written to disk, a
ColumnFamilyStatistics file is written. This includes information such as the
number of keys, row and column counts, and data sizes, to name a few items.
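
The flush steps above can be sketched in memory as follows. This is an illustration under stated assumptions, not Cassandra's file formats: BloomFilter, flush_memtable, and read_row are hypothetical names, SHA-256 is used only as a convenient hash, JSON lines stand in for the serialized data file, and the partitioner and compression options are left out entirely.

```python
import hashlib
import json


class BloomFilter:
    """Small bloom filter; sizes and hash choice are arbitrary for this sketch."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos] for pos in self._positions(key))


def flush_memtable(rows):
    """Turn {row_key: {column: value}} into SSTable-like components in memory."""
    bloom = BloomFilter()
    index = {}  # row key -> byte offset into the data component
    data = bytearray()
    column_count = 0
    for key in sorted(rows):  # rows are sorted by row key before writing
        bloom.add(key)
        index[key] = len(data)
        record = json.dumps({"key": key, "columns": rows[key]}, sort_keys=True)
        data += record.encode("utf-8") + b"\n"
        column_count += len(rows[key])
    # Statistics are produced last, once the other components are complete.
    stats = {"keys": len(rows), "columns": column_count, "data_bytes": len(data)}
    return {"filter": bloom, "index": index, "data": bytes(data), "statistics": stats}


def read_row(sstable, key):
    """Look up a row using the filter first, then the index, then the data."""
    if not sstable["filter"].might_contain(key):
        return None  # the filter rules the key out without touching the data
    offset = sstable["index"].get(key)
    if offset is None:
        return None  # bloom filter false positive
    end = sstable["data"].index(b"\n", offset)
    return json.loads(sstable["data"][offset:end])["columns"]
```

The read_row helper is included only to show why the filter and index are worth writing: a lookup can skip the data component entirely when the filter reports that a key is absent, and otherwise jumps straight to the recorded offset.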
