Using the Cassandra bulk loader to restore the data
An alternative technique for loading data into Cassandra is the sstableloader utility,
which can be found under the bin directory of the Cassandra installation. This tool is
especially useful when the number of nodes or the replication strategy has changed because,
unlike the copy method, it streams the appropriate data to the appropriate nodes based on
the cluster configuration.
Assuming that you have the -Index.db and -Data.db files with you, here are the steps
to use sstableloader:
1. Check the node's schema. If it does not have the keyspaces and column families
that are being restored, create them.
2. Create a directory with the same name as the keyspace that is being loaded. Inside
this directory, keep each column family's data (the .db files) in a subdirectory
named after the column family. For example, if you are restoring the myCF column
family in the mykeyspace keyspace, all mykeyspace-myCF-hf-x-Data.db and
mykeyspace-myCF-hf-x-Index.db files (where x is an integer) should be placed
within the directory structure mykeyspace/myCF/.
3. Finally, execute bin/sstableloader mykeyspace.
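Step 2 can be scripted. The following is a minimal sketch in Python, assuming the backed-up files follow the keyspace-columnfamily-version-generation-Component.db naming shown above; the function name stage_sstables and the directory arguments are hypothetical, introduced only for illustration.

```python
import re
import shutil
from pathlib import Path

# Assumed naming pattern, taken from the example in the text:
# mykeyspace-myCF-hf-12-Data.db, mykeyspace-myCF-hf-12-Index.db, and so on.
SSTABLE_NAME = re.compile(
    r"^(?P<keyspace>[^-]+)-(?P<cf>[^-]+)-[a-z]+-\d+-(?P<component>[A-Za-z]+)\.db$"
)


def stage_sstables(backup_dir, staging_root):
    """Copy SSTable files into <staging_root>/<keyspace>/<column family>/.

    Returns the list of staged file paths, in sorted source-name order.
    """
    staged = []
    for src in sorted(Path(backup_dir).iterdir()):
        match = SSTABLE_NAME.match(src.name)
        if not src.is_file() or match is None:
            continue  # skip anything that does not look like an SSTable component
        dest_dir = Path(staging_root) / match["keyspace"] / match["cf"]
        dest_dir.mkdir(parents=True, exist_ok=True)
        staged.append(Path(shutil.copy2(src, dest_dir / src.name)))
    return staged
```

After staging, step 3 is run from the staging root so that the mykeyspace directory is visible to the loader. Depending on the Cassandra version, sstableloader may also expect the target nodes to be given with the -d option and the path to point at the column family directory, so check the usage output of your installation before relying on the exact command line.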
Cassandra's bulk loader simplifies the task to the extent that one can store the backup in
the exact directory structure required by sstableloader and, whenever a restoration is
required, just download the backup directory and execute sstableloader.
The backup step is mechanical and can easily be automated as a daily job using cron and a
shell script. It may be a good idea to clear the snapshots once in a while and start taking
fresh snapshots from that point on.
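The clear-then-snapshot routine could be sketched as follows, in Python rather than shell. This is only an illustration under stated assumptions: it presumes snapshots are taken with nodetool snapshot and removed with nodetool clearsnapshot, that both accept a -t tag, and that nodetool is on the PATH. The function names and the daily-YYYY-MM-DD tag scheme are hypothetical.

```python
import subprocess
from datetime import date


def snapshot_commands(keyspace, today, previous_tag=None, nodetool="nodetool"):
    """Build the nodetool invocations for one backup run.

    If previous_tag is given, that snapshot is cleared first, then a fresh
    snapshot tagged with today's date is taken.
    """
    tag = f"daily-{today.isoformat()}"
    commands = []
    if previous_tag is not None:
        # Assumption: clearsnapshot accepts -t <tag> followed by the keyspace.
        commands.append([nodetool, "clearsnapshot", "-t", previous_tag, keyspace])
    # Assumption: snapshot accepts -t <tag> followed by the keyspace.
    commands.append([nodetool, "snapshot", "-t", tag, keyspace])
    return tag, commands


def run_backup(keyspace, previous_tag=None):
    """Run the commands for today; raises CalledProcessError if nodetool fails."""
    tag, commands = snapshot_commands(keyspace, date.today(), previous_tag)
    for command in commands:
        subprocess.run(command, check=True)
    return tag
```

Keeping command construction separate from execution makes the logic easy to check without a running node. A cron entry that runs a script calling run_backup("mykeyspace") once a day would complete the automation; the script location and schedule are left to the reader's environment.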
Note
Backup
Coming from traditional databases, one tends to think of data backup as an essential part of
data management: data must be backed up daily, copied to a hard disk, and stored in a safe
place. This is a good idea. It gets harder and less efficient to achieve this as the data size