to the environment in the form of multiple masters. Because all of the replicas in Cassandra are
identical, failures of a node won't disrupt service.
In short, because Cassandra is distributed and decentralized, there is no single point of failure,
which supports high availability.
Elastic Scalability
Scalability is an architectural feature of a system that can continue serving a greater number of
requests with little degradation in performance. Vertical scaling—simply adding more hardware
capacity and memory to your existing machine—is the easiest way to achieve this. Horizontal
scaling means adding more machines that have all or some of the data on them so that no one
machine has to bear the entire burden of serving requests. But then the software itself must have
an internal mechanism for keeping its data in sync with the other nodes in the cluster.
Elastic scalability refers to a special property of horizontal scalability. It means that your cluster
can seamlessly scale up and scale back down. To do this, the cluster must be able to accept new
nodes that can begin participating by getting a copy of some or all of the data and start serving
new user requests without major disruption or reconfiguration of the entire cluster. You don't
have to restart your process. You don't have to change your application queries. You don't have
to manually rebalance the data yourself. Just add another machine—Cassandra will find it and
start sending it work.
Scaling down, of course, means removing some of the processing capacity from your cluster.
You might have to do this if you move parts of your application to another platform, or if your
application loses users and you need to start selling off hardware. Let's hope that doesn't happen.
But if it does, you won't need to upset the entire apple cart to scale back.
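The reason nodes can join and leave without a full rebalance is that Cassandra places data on a hash ring: each node owns a range of token values, and adding a node transfers only the range it takes over. The sketch below is a toy illustration of that idea, not Cassandra's actual implementation (which uses the Murmur3 partitioner and virtual nodes); the node names and the single-token-per-node scheme are simplifying assumptions.

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    # Stable 64-bit hash so results are reproducible across runs.
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

class ToyRing:
    """Toy consistent-hash ring: each key belongs to the first node
    token found clockwise from the key's hash."""

    def __init__(self, nodes=()):
        self._tokens = []   # sorted node tokens
        self._owners = {}   # token -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        token = _hash(node)
        bisect.insort(self._tokens, token)
        self._owners[token] = node

    def remove_node(self, node: str) -> None:
        token = _hash(node)
        self._tokens.remove(token)
        del self._owners[token]

    def owner(self, key: str) -> str:
        # First token at or after the key's hash, wrapping to the start.
        i = bisect.bisect(self._tokens, _hash(key)) % len(self._tokens)
        return self._owners[self._tokens[i]]

keys = [f"user-{i}" for i in range(1000)]
ring = ToyRing(["node-a", "node-b", "node-c"])
before = {k: ring.owner(k) for k in keys}

# Scale up: the new node takes over one arc of the ring, so only the
# keys on that arc move -- and every moved key lands on the new node.
ring.add_node("node-d")
after = {k: ring.owner(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved to node-d")

# Scale down: removing the node hands its arc back, restoring the
# original ownership map without touching the other nodes' data.
ring.remove_node("node-d")
restored = {k: ring.owner(k) for k in keys}
print(restored == before)  # True
```

Note that every relocated key belongs to the new node, and removing it restores the previous layout exactly; that locality is what lets a real cluster grow and shrink without a global reshuffle.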
High Availability and Fault Tolerance
In general architecture terms, the availability of a system is measured according to its ability to
fulfill requests. But computers can experience all manner of failure, from hardware component
failure to network disruption to data corruption. Any computer is susceptible to these kinds of failure.
There are of course very sophisticated (and often prohibitively expensive) computers that can
themselves mitigate many of these circumstances, as they include internal hardware redundan-
cies and facilities to send notification of failure events and hot swap components. But anyone
can accidentally break an Ethernet cable, and catastrophic events can beset a single data center.
So for a system to be highly available, it must typically include multiple networked computers,
and the software they're running must then be capable of operating in a cluster and have some
facility for recognizing node failures and failing over requests to another part of the system.
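The failover behavior described above can be sketched in a few lines. This is a deliberately simplified model, not Cassandra's coordinator logic: the replica names are made up, and a real system would detect failures via timeouts and gossip rather than a hand-maintained set.

```python
class ReplicaGroup:
    """Toy coordinator that fails a read over to the next live replica."""

    def __init__(self, replicas):
        self.replicas = list(replicas)
        self.down = set()   # nodes currently believed to be dead

    def read(self, key: str) -> str:
        for node in self.replicas:
            if node in self.down:
                continue    # skip nodes marked as failed
            return f"{key}@{node}"
        raise RuntimeError("all replicas are down")

group = ReplicaGroup(["replica-1", "replica-2", "replica-3"])
print(group.read("some-key"))   # prints "some-key@replica-1"
group.down.add("replica-1")     # simulate a node failure
print(group.read("some-key"))   # prints "some-key@replica-2"
```

The request succeeds as long as any replica remains reachable, which is the essence of the availability guarantee: redundancy plus a routing layer that notices failures and steers around them.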