Another situation where a cleanup might be used is when moving tokens around. In Cassandra 1.1 and earlier, or when vnodes are not used, this means changing which tokens are assigned to a particular node or nodes. In Cassandra 1.2 and later, when vnodes are in use, you will need to run a cleanup if the num_tokens setting is modified. A cleanup is run after tokens are moved because the keys that no longer belong to the node need to be removed.
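To make the ownership change concrete, here is a minimal sketch of a toy token ring. It is an illustration under simplifying assumptions, not Cassandra's implementation: the 0..99 token space, the node names A, B, and C, and the helper functions owner and cleanup are all hypothetical, and replication is ignored.

```python
from bisect import bisect_left

def owner(ring, token):
    """Return the node that owns `token`: the first node token >= token, wrapping around."""
    tokens = sorted(ring)
    idx = bisect_left(tokens, token) % len(tokens)
    return ring[tokens[idx]]

def cleanup(ring, node, stored_tokens):
    """Keep only the key tokens that `node` still owns under `ring` (toy model of a cleanup)."""
    return [t for t in stored_tokens if owner(ring, t) == node]

# Hypothetical three-node ring; each node owns the range ending at its token.
old_ring = {25: "A", 50: "B", 75: "C"}
# Node B holds keys whose tokens fall in (25, 50].
b_data = [30, 40, 45, 50]

# Move B's token from 50 to 40: tokens 41..50 now belong to C.
new_ring = {25: "A", 40: "B", 75: "C"}
b_after = cleanup(new_ring, "B", b_data)
print(b_after)  # [30, 40]
```

Until the cleanup runs, node B would still be carrying the data for tokens 45 and 50 on disk even though it no longer serves them.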
upgradesstables and scrub
The nodetool upgradesstables and nodetool scrub commands are grouped together because the work done by upgradesstables is a subset of the work done by scrub.
The job of nodetool upgradesstables is to rebuild your SSTables. This may be needed after a version upgrade of Cassandra or after something as simple as changing the compression options of a ColumnFamily. When upgradesstables runs, it rebuilds all the SSTables and discards data that it deems to be broken.
The additional job that scrub does is to snapshot your data before rebuilding the SSTables. Although this is a good first step to take, it also means that you have to remove the snapshot by hand afterward. If you just want to upgrade your SSTables, it is not necessary to run nodetool scrub.
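The choice between the two commands can be sketched as a small helper that only builds the command lines and prints them; it does not run anything. The helper names nodetool_cmd and maintenance_plan and the keyspace name my_ks are hypothetical. The sketch assumes the basic invocation forms nodetool upgradesstables, nodetool scrub, and nodetool clearsnapshot followed by a keyspace name; the exact arguments accepted vary between Cassandra versions, so check nodetool help on your installation.

```python
import shlex

def nodetool_cmd(action, keyspace=None, *tables, host=None):
    """Build a nodetool argument list. Nothing is executed here."""
    cmd = ["nodetool"]
    if host:
        cmd += ["-h", host]
    cmd.append(action)
    if keyspace:
        cmd.append(keyspace)
        cmd.extend(tables)
    return cmd

def maintenance_plan(keyspace, want_snapshot):
    """Return the commands to run, in order, for one keyspace."""
    if want_snapshot:
        # scrub snapshots first, so the snapshot must be removed by hand later,
        # once you are satisfied with the rebuilt SSTables.
        return [
            nodetool_cmd("scrub", keyspace),
            nodetool_cmd("clearsnapshot", keyspace),
        ]
    # Just rebuilding SSTables: upgradesstables alone is enough.
    return [nodetool_cmd("upgradesstables", keyspace)]

for cmd in maintenance_plan("my_ks", want_snapshot=True):
    print(" ".join(shlex.quote(part) for part in cmd))
```

Keeping the commands as argument lists makes it straightforward to pass them to subprocess.run later without shell quoting problems.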
In the same family of tools as upgradesstables and scrub, there is a tool that ships with Cassandra called sstablescrub. The job of sstablescrub is to fix (throw away) corrupted tables. It is designed to be run while the Cassandra node is stopped, so you should run it around the cluster in a rolling fashion. The use of sstablescrub is typically not necessary and shouldn't be part of your routine maintenance. It is usually reserved for cases where a nodetool scrub failed.
Compactions
The process of merging multiple SSTable files into fewer files is called compaction. This is done for a variety of reasons, ranging from freeing up space to validating data on disk. Compactions are a regular part of working with Cassandra, and their impact can be minimized if handled properly.
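The basic idea of merging files and freeing space can be sketched with a toy model. This is not Cassandra's compaction code: the function name compact and the (key, timestamp, value) tuple layout are assumptions made for illustration, and real compaction also deals with tombstones, compaction strategies, and on-disk formats.

```python
def compact(sstables):
    """Merge several lists of (key, timestamp, value) into one sorted list.

    For each key, only the entry with the newest timestamp survives,
    which is how older, overwritten versions stop taking up space.
    """
    latest = {}
    for table in sstables:
        for key, ts, value in table:
            if key not in latest or ts > latest[key][0]:
                latest[key] = (ts, value)
    return [(key, ts, value) for key, (ts, value) in sorted(latest.items())]

# Two hypothetical "files": the newer one overwrites key "b".
older = [("a", 1, "x"), ("b", 1, "y")]
newer = [("b", 2, "z"), ("c", 2, "w")]

merged = compact([older, newer])
print(merged)  # [('a', 1, 'x'), ('b', 2, 'z'), ('c', 2, 'w')]
```

Two input files holding four entries become one file holding three, because the stale version of "b" is dropped during the merge.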