value. A tombstone is a deletion marker that is required to suppress older data in SSTables until
compaction can run.
There's a related setting called Garbage Collection Grace Seconds. This is the amount of time
the server will wait before garbage-collecting a tombstone. By default, it's set to 864,000 seconds,
the equivalent of 10 days. Cassandra keeps track of tombstone age, and once a tombstone is older
than GCGraceSeconds, it will be garbage-collected. The purpose of this delay is to give an
unavailable node time to recover; if a node is down longer than this value, it is treated as
failed and replaced.
As of 0.7, this setting is configurable per column family (it used to be for the whole keyspace).
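To make the rule concrete, here is a minimal sketch in Java of the eligibility check described above. It is purely illustrative: the class and method names (Tombstone, isCollectable) are hypothetical and do not correspond to Cassandra's actual compaction code.

    import java.time.Duration;
    import java.time.Instant;

    public class GcGraceExample {

        // Default grace period: 864,000 seconds (10 days)
        static final long GC_GRACE_SECONDS = 864_000;

        record Tombstone(String key, Instant deletedAt) {}

        // A tombstone may be dropped during compaction only once it is older than
        // GCGraceSeconds; until then it must be kept so that a replica that missed
        // the delete can still learn about it when it comes back.
        static boolean isCollectable(Tombstone t, Instant now) {
            return Duration.between(t.deletedAt(), now).getSeconds() > GC_GRACE_SECONDS;
        }

        public static void main(String[] args) {
            Instant now = Instant.now();
            Tombstone fresh = new Tombstone("user:42", now.minus(Duration.ofDays(2)));
            Tombstone stale = new Tombstone("user:99", now.minus(Duration.ofDays(11)));
            System.out.println(isCollectable(fresh, now));  // false: still inside the grace period
            System.out.println(isCollectable(stale, now));  // true: eligible for collection
        }
    }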
Staged Event-Driven Architecture (SEDA)
Cassandra implements a Staged Event-Driven Architecture (SEDA). SEDA is a general architec-
ture for highly concurrent Internet services, originally proposed in a 2001 paper called “SEDA:
An Architecture for Well-Conditioned, Scalable Internet Services” by Matt Welsh, David Culler,
and Eric Brewer (whom you might recall from our discussion of the CAP theorem).
NOTE
You can read the original SEDA paper at http://www.eecs.harvard.edu/~mdw/proj/seda .
In a typical application, a single unit of work is often performed within the confines of a
single thread. A write operation, for example, will start and end within the same thread. Cas-
sandra, however, is different: its concurrency model is based on SEDA, so a single operation
may start with one thread, which then hands off the work to another thread, which may hand
it off to other threads. But it's not up to the current thread to hand off the work to another
thread. Instead, work is subdivided into what are called stages, and the thread pool (really, a
java.util.concurrent.ExecutorService) associated with the stage determines execution.
A stage is a basic unit of work, and a single operation may internally transition from one
stage to the next. Because each stage can be handled by a different thread pool, the stages of
many operations can execute concurrently, which gives Cassandra a significant performance
improvement. The SEDA design also lets Cassandra manage its own resources more effectively:
different operations might require disk I/O, might be CPU-bound, or might be network
operations, so each pool can schedule its work according to the availability of the resource it
depends on.
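The hand-off pattern can be sketched with two hypothetical stages, each backed by its own ExecutorService. This is only an illustration of the idea, not Cassandra's actual stage implementation; the stage names and pool sizes are invented for the example.

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class SedaSketch {

        // Each stage owns its own queue and thread pool, so disk-, CPU-, and
        // network-bound work can be sized and scheduled independently.
        static final ExecutorService MUTATION_STAGE = Executors.newFixedThreadPool(4);
        static final ExecutorService RESPONSE_STAGE = Executors.newFixedThreadPool(2);

        static void write(String key, String value) {
            // Stage 1: apply the write on the mutation stage's pool...
            MUTATION_STAGE.submit(() -> {
                System.out.println(Thread.currentThread().getName()
                        + " applied " + key + "=" + value);
                // ...then hand the rest of the operation off to the next stage,
                // rather than finishing it on the current thread.
                RESPONSE_STAGE.submit(() -> System.out.println(
                        Thread.currentThread().getName() + " acknowledged " + key));
            });
        }

        public static void main(String[] args) throws InterruptedException {
            write("user:42", "some value");
            MUTATION_STAGE.shutdown();
            MUTATION_STAGE.awaitTermination(5, TimeUnit.SECONDS);
            RESPONSE_STAGE.shutdown();
            RESPONSE_STAGE.awaitTermination(5, TimeUnit.SECONDS);
        }
    }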
A stage consists of an incoming event queue, an event handler, and an associated thread
pool. Stages are managed by a controller that determines scheduling and thread allocation;
Cassandra implements this kind of concurrency model using the thread pool
java.util.concurrent.ExecutorService.