Chapter 6. Managing a Cluster - Scaling, Node Repair, and Backup
As a system grows or an application matures, as the cloud infrastructure starts to warn
you about failing underlying hardware, or when you get hit by the TechCrunch effect, you
may need to do one of these things: repair, back up, or scale up or down. Alternatively,
management might decide to set up another data center just for data analysis (maybe using
Hadoop), so that the analysis does not affect the user experience served from the existing
data center. These tasks are an integral part of a system administrator's day job.
Fortunately, all of them are fairly easy in Cassandra, and they are well documented.
In this chapter, we will go through Cassandra's built-in DevOps tool and discuss how to
scale a cluster up and shrink it down. We will also see how to replace a dead node, or
simply remove it and let the other nodes bear the extra load. Further, we will briefly look
at backing up and restoring Cassandra data. We will also see how virtual nodes take away
the burden of manually rebalancing the cluster, which used to be a source of headaches in
versions before Cassandra 1.2. You will still have to balance nodes yourself if you decide
not to use virtual nodes.
Most of these tasks are mechanical and really simple to automate. Maintaining a large
cluster by hand is a burden, and sooner or later you will make a mistake.
Note
The default installation of Cassandra 1.2 onwards uses virtual nodes (vnodes). As of
Cassandra 2.1.x, you may still opt out of vnodes, but that is not a good idea. A major part
of this chapter deals with initial_token, load balancing, token distribution, and token
generation. None of these is required if you are using vnodes.
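
To give a feel for what manual token generation involves when vnodes are not used, here is a minimal sketch. It assumes the cluster uses Murmur3Partitioner (token range -2^63 to 2^63 - 1), a single data center, and one token per node; the function name murmur3_initial_tokens is hypothetical and is not part of Cassandra.

```python
def murmur3_initial_tokens(node_count):
    """Return evenly spaced tokens, one per node.

    Assumption: Murmur3Partitioner, whose tokens span -2**63 .. 2**63 - 1.
    This helper is illustrative only; it is not shipped with Cassandra.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    ring_size = 2 ** 64        # total number of tokens on the ring
    min_token = -(2 ** 63)     # smallest Murmur3 token
    # Integer arithmetic keeps the values exact (floats would lose precision).
    return [min_token + (i * ring_size) // node_count
            for i in range(node_count)]


if __name__ == "__main__":
    # Each value would go into that node's initial_token setting.
    for index, token in enumerate(murmur3_initial_tokens(4)):
        print(f"node {index}: initial_token: {token}")
```

Every time the node count changes, these values change too, so existing nodes have to be moved to new tokens. That is the rebalancing work that vnodes make unnecessary.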