from the node for offline storage or anything else for which you may want to use
a backup.
There is one major “gotcha” with snapshots: data that has been written to the
CommitLog but not yet flushed to disk will not make it into the snapshot. If you
want that data to be in the snapshot, make sure you run nodetool flush prior to
starting the snapshot. As you recall, this writes the data still held in memory
out to SSTables in its proper ColumnFamily directories on disk.
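The flush-then-snapshot order described above can be sketched as a small shell helper. This is only an illustration: the function name flush_then_snapshot and the NODETOOL override variable are assumptions introduced here, not part of Cassandra, and the snapshot invocation follows the form used in Listing 7.12.

```shell
# Hypothetical helper: flush one ColumnFamily, then snapshot it.
# NODETOOL is an assumed override so the commands can be dry-run with "echo".
NODETOOL="${NODETOOL:-nodetool}"

flush_then_snapshot() {
    keyspace="$1"
    cf="$2"
    # Flush first so recently written data is on disk before the snapshot.
    # The snapshot is only taken if the flush succeeded.
    $NODETOOL flush "$keyspace" "$cf" &&
        $NODETOOL snapshot "$keyspace" -cf "$cf"
}

# Example (on a running node): flush_then_snapshot MainKeyspace events
```

Chaining the two commands with && means a failed flush stops the script before a snapshot with missing data is taken.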
Taking Snapshots
Taking a snapshot in Cassandra is a straightforward process that is easily
accomplished with nodetool. Let's start by creating an unnamed snapshot of the
events ColumnFamily in our keyspace MainKeyspace. This is achieved by running
the command shown in Listing 7.12.
Listing 7.12 Snapshot of events ColumnFamily from MainKeyspace
$ nodetool snapshot MainKeyspace -cf events
Requested snapshot for: MainKeyspace and column family: events
Snapshot directory: 1361544837468
Running this command will accomplish a few things. First, it will create what is
effectively a point-in-time copy of the events ColumnFamily's data files in the
snapshots subdirectory of the ColumnFamily (Cassandra uses hard links for this
rather than copying the bytes). The name of the directory created in the previous
example is the timestamp, in milliseconds, of when the snapshot was taken. The
files in this new directory will exactly match the files in the ColumnFamily
directory at the time the snapshot was executed.
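Because an unnamed snapshot's directory name is a millisecond timestamp, it can be turned back into a readable date when inspecting old snapshots. The following is a minimal sketch; the helper names are made up for this example, and the date -d @SECONDS form assumes GNU date (BSD and macOS date use a different flag).

```shell
# Hypothetical helpers for reading an unnamed snapshot's directory name.

# Convert a millisecond timestamp (the directory name) to whole seconds.
snapshot_epoch_seconds() {
    echo $(( $1 / 1000 ))
}

# Print the snapshot time in UTC. Assumes GNU date ("-d @SECONDS").
snapshot_time_utc() {
    date -u -d "@$(snapshot_epoch_seconds "$1")" '+%Y-%m-%d %H:%M:%S'
}

# Example: snapshot_time_utc 1361544837468
```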
Note that it is also possible (and more common) to use the naming feature of
nodetool snapshot (-t) and have the snapshots follow a naming convention. For
example, if the snapshot in Listing 7.12 were being scripted into a backup
system, it would make sense to include the ColumnFamily name, the date and time
of the snapshot, and the machine it came from (see Listing 7.13).
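A backup script might assemble such a name before calling nodetool snapshot with -t. The sketch below is one possible convention, not necessarily the one used in Listing 7.13: the function name, the field order, and the underscore separators are all assumptions chosen for illustration.

```shell
# Hypothetical helper: build a snapshot name from the ColumnFamily name,
# a date-time stamp, and the host name. The format is an assumed convention.
snapshot_name() {
    cf="$1"
    stamp="${2:-$(date -u '+%Y%m%d%H%M%S')}"   # current UTC time by default
    host="${3:-$(hostname)}"                   # this machine by default
    echo "${cf}_${stamp}_${host}"
}

# Example (on a running node):
#   nodetool snapshot MainKeyspace -cf events -t "$(snapshot_name events)"
```

Accepting the stamp and host as optional arguments keeps the helper easy to check by hand, while the defaults cover the usual scripted case.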
Listing 7.13 Named Snapshot of events ColumnFamily from MainKeyspace