nodetool -h 10.100.1.110 setstreamthroughput 16 &&
nodetool -h 10.100.2.120 setstreamthroughput 16
Backup and Restore
Making backups in Cassandra is a little tricky. The first thing to keep in mind is
that in a distributed system there is not just a lot of data; there are also a lot
of machines on which that data resides. So whatever you choose as a storage
medium for backups, ensure that it has plenty of space.
Are Backups Necessary?
There is some debate as to whether or not backups in a large enough distributed
system are even necessary. While it is good practice to make regular backups, it
may not be a requirement for your system. And if backups are not a major requirement,
the overall complexity and storage requirements for your architecture can be
drastically reduced.
There are certain situations where you can get away with not having a backup.
But as with any major decision, there are trade-offs. As a reminder, if you have
a replication factor of 3, that means you have a copy of data on a total of three
separate nodes. In Amazon Web Services terminology, if two of those nodes are
in separate availability zones (us-east-1a and us-east-1b) and the third node is in
a different region (us-west-1a, in the us-west-1 region), the likelihood of all
three nodes in that replica set failing at the same time is rather low. But since
there is still a chance, it is a decision that you have to make based on your data
requirements.
Snapshots
The major risk you take with no backup is data problems. And in systems that
are bleeding edge and still in heavy development such as Cassandra, there is always
a possibility of problems. So let's assume that with your architecture and
data set size (or whatever your reasons are) backups are a requirement. In
Cassandra, backups are done using snapshots.
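As a rough sketch of what taking a snapshot across nodes might look like, the script below prints one nodetool snapshot command per node instead of running it, so the commands can be reviewed first. The host addresses are reused from the earlier example; the tag nightly, the keyspace my_keyspace, and the helper snapshot_cmd are hypothetical names chosen for illustration, and the exact option order may differ between Cassandra versions.

```shell
#!/bin/sh
# Sketch only: print (rather than run) the snapshot command for each node.
# The tag and keyspace names below are hypothetical placeholders.

# Build the nodetool command line for one node.
snapshot_cmd() {
    host="$1"      # node to snapshot
    tag="$2"       # label used to identify the snapshot later
    keyspace="$3"  # keyspace to include in the snapshot
    printf 'nodetool -h %s snapshot -t %s %s\n' "$host" "$tag" "$keyspace"
}

# A snapshot copies only the data held on the node it runs against,
# so the command is repeated for every node.
for host in 10.100.1.110 10.100.2.120; do
    snapshot_cmd "$host" nightly my_keyspace
done
```

Once the printed commands look right, they can be executed by piping the script's output to sh.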
When Cassandra data is stored on disk, there are many SSTables per ColumnFamily
and many files per SSTable. And that is just on a single node containing a
subset of the data. In order to simplify the backup process, the concept of snapshots
was created. The purpose of a snapshot is to make a copy of some or all of
the data on a node. After the snapshot is created, it can be easily copied or removed