Maintenance - Practical Cassandra


from the node for offline storage or anything else for which you may want to use a backup.

There is one major “gotcha” with snapshots: data that exists only in the CommitLog (and in the in-memory memtables) will not make it into the snapshot. If you want that data to be in the snapshot, make sure you run nodetool flush before starting the snapshot. As you recall, this writes the memtables out as SSTables in their proper ColumnFamily directories on disk, so the data is no longer held only in the CommitLog.
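The flush-then-snapshot order described above can be sketched as a small shell function. This is a minimal sketch, not a complete backup script: the function name and the NODETOOL override variable are assumptions made for illustration (NODETOOL is not a nodetool feature; it only allows a dry run with NODETOOL=echo), and the keyspace and ColumnFamily names are the ones used in this chapter.

```shell
#!/bin/sh
# Sketch: flush a ColumnFamily to disk, then snapshot it.
# NODETOOL is a hypothetical override for illustration only; set
# NODETOOL=echo to print the commands instead of running them.

flush_then_snapshot() {
    keyspace="$1"
    column_family="$2"

    # Flush first so that recent writes exist as SSTables on disk.
    "${NODETOOL:-nodetool}" flush "$keyspace" "$column_family" || return 1

    # Only take the snapshot if the flush succeeded.
    "${NODETOOL:-nodetool}" snapshot "$keyspace" -cf "$column_family"
}
```

Calling flush_then_snapshot MainKeyspace events would run the two nodetool commands in sequence and stop if the flush fails.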

Taking Snapshots

Taking a snapshot in Cassandra is a straightforward process that is easily accomplished with nodetool. Let's start by creating an unnamed snapshot of the events ColumnFamily in our keyspace, MainKeyspace. This is achieved by running the command shown in Listing 7.12.

Listing 7.12 Snapshot of events ColumnFamily from MainKeyspace

$ nodetool snapshot MainKeyspace -cf events
Requested snapshot for: MainKeyspace and column family: events
Snapshot directory: 1361544837468

Running this command will accomplish a few things. First, it will create what is basically a direct file copy of the events ColumnFamily in the snapshots subdirectory of the ColumnFamily. (Cassandra implements this with hard links to the existing SSTable files rather than by duplicating the data.) The directory created in the previous example is named with the timestamp, in milliseconds, of the moment the snapshot was taken. The files in this new directory will exactly match all the files in the ColumnFamily directory at the time the snapshot was executed.
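Because an unnamed snapshot directory is just a millisecond timestamp, it can be turned back into a readable date when browsing old snapshots. The following is a minimal sketch; the function name is hypothetical, and it assumes GNU date (the -d @SECONDS form), whereas BSD and macOS date take -r SECONDS instead.

```shell
#!/bin/sh
# Sketch: convert a snapshot directory name (milliseconds since the
# Unix epoch) into a UTC timestamp.
# Assumes GNU date; BSD/macOS date would need "-r SECONDS" instead.

snapshot_dir_to_utc() {
    millis="$1"
    # Drop the millisecond part; date works in whole seconds.
    seconds=$(( millis / 1000 ))
    date -u -d "@${seconds}" '+%Y-%m-%dT%H:%M:%SZ'
}
```

For the directory in Listing 7.12, snapshot_dir_to_utc 1361544837468 prints 2013-02-22T14:53:57Z.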

Note that it is also possible (and more common) to use the naming feature of nodetool snapshot (-t) so that snapshots follow a naming convention. For example, if the snapshot in Listing 7.12 were being scripted into a backup system, it would make sense to include the ColumnFamily name, the date and time of the snapshot, and the machine it came from (see Listing 7.13).
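One way a backup script might assemble such a name is sketched here. The name layout (ColumnFamily, UTC timestamp, host, joined by hyphens), both function names, and the NODETOOL override variable are assumptions made for illustration; they are not taken from the book's listing, and any consistent convention would serve.

```shell
#!/bin/sh
# Sketch: build a snapshot name from the ColumnFamily, a timestamp and
# the host, then pass it to nodetool snapshot with -t.
# The name layout and the NODETOOL override are illustrative assumptions.

build_snapshot_name() {
    column_family="$1"
    host="$2"
    stamp="$3"
    printf '%s-%s-%s\n' "$column_family" "$stamp" "$host"
}

take_named_snapshot() {
    keyspace="$1"
    column_family="$2"
    # uname -n gives the node's hostname; the date is UTC, compact form.
    name="$(build_snapshot_name "$column_family" "$(uname -n)" \
        "$(date -u '+%Y%m%dT%H%M%SZ')")"
    "${NODETOOL:-nodetool}" snapshot "$keyspace" -cf "$column_family" -t "$name"
}
```

Keeping the name construction in its own function makes the convention easy to change in one place and easy to reuse when a restore script needs to locate a snapshot.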

Listing 7.13 Named Snapshot of events ColumnFamily from MainKeyspace
