Horizontal scalability
Horizontal scalability refers to the ability to expand the storage and processing capacity of
a database by adding more servers to a database cluster. A traditional single-master
database's storage capacity is limited by the capacity of the server that hosts the master
instance. If the data set outgrows this capacity, and a more powerful server isn't available,
the data set must be sharded among multiple independent database instances that know
nothing of each other. Your application bears responsibility for knowing to which instance a
given piece of data belongs.
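To make that burden concrete, here is a minimal sketch of the routing logic an application might carry when it shards data by hand. The host names and the `shard_for` helper are hypothetical, and hash-modulo placement is only one of several possible schemes; it is shown here because it is the simplest.

```python
import hashlib

# Hypothetical host names, one per independent database instance.
SHARDS = [
    "db-0.example.internal",
    "db-1.example.internal",
    "db-2.example.internal",
]


def shard_for(key: str, shards=SHARDS) -> str:
    """Return the instance that owns `key` under hash-modulo sharding."""
    # Use a stable hash (not Python's built-in hash(), which is salted
    # per process) so every application server picks the same shard.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# The application must perform this lookup before every read or write.
target_host = shard_for("user:42")
```

Every part of the application that touches the database needs this lookup, and changing the number of shards changes the result of the modulo for most keys, so existing data has to be moved by hand.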
Cassandra, on the other hand, is deployed as a cluster of instances that are all aware of
each other. From the client application's standpoint, the cluster is a single entity; the
application need not know, nor care, which machine a piece of data belongs to. Instead, data
can be read from or written to any instance in the cluster, referred to as a node; that node
forwards the request to the instance where the data actually belongs.
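The forwarding behavior can be illustrated with a toy, in-memory model. This is not Cassandra's implementation: the `Node` class and `make_cluster` helper are hypothetical, ownership is decided by a simple hash-modulo rule as a simplifying assumption, and replication is ignored. The point is only that a client may talk to any node and still reach the right data.

```python
import hashlib


class Node:
    """Toy node: stores its own rows and knows every peer in the cluster."""

    def __init__(self, name):
        self.name = name
        self.rows = {}
        self.peers = []  # filled in by make_cluster()

    def _owner(self, key):
        # Every node applies the same rule to the same peer list,
        # so all nodes agree on which node owns a key.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return self.peers[int.from_bytes(digest[:8], "big") % len(self.peers)]

    def write(self, key, value):
        # Forward the write to the owning node (possibly this one).
        owner = self._owner(key)
        owner.rows[key] = value
        return owner.name

    def read(self, key):
        # Forward the read to the owning node and return its answer.
        return self._owner(key).rows.get(key)


def make_cluster(names):
    nodes = [Node(name) for name in names]
    for node in nodes:
        node.peers = nodes
    return nodes


cluster = make_cluster(["node-a", "node-b", "node-c"])
cluster[0].write("user:1", "Ada")      # client happens to contact node-a
value = cluster[2].read("user:1")      # a later read goes through node-c
```

In this sketch the row is stored on exactly one node, yet a read through any node returns it, which is the property the client application relies on.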
The result is that a Cassandra deployment's capacity to store and process data grows with
the size of the cluster; when additional capacity is required, more machines can simply be
added. When new machines join the cluster, Cassandra takes care of rebalancing the existing
data so that each node in the expanded cluster has a roughly equal share.
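One way to see why adding a machine need not reshuffle everything is consistent hashing, the placement technique described in the Dynamo paper. The sketch below is a simplified illustration and not Cassandra's actual partitioner: the `Ring` class, the number of virtual nodes, and the use of MD5 for tokens are all assumptions made for the example.

```python
import bisect
import hashlib


def _token(value: str) -> int:
    """Map a string to a position on the ring (a 64-bit integer)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class Ring:
    """Toy consistent-hash ring with several virtual nodes per machine."""

    def __init__(self, vnodes=64):
        self.vnodes = vnodes
        self._tokens = []   # sorted ring positions
        self._owners = {}   # ring position -> node name

    def add_node(self, name):
        for i in range(self.vnodes):
            token = _token(f"{name}#{i}")
            bisect.insort(self._tokens, token)
            self._owners[token] = name

    def owner(self, key):
        # A key belongs to the first token clockwise from its own position,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_right(self._tokens, _token(key)) % len(self._tokens)
        return self._owners[self._tokens[idx]]


ring = Ring()
for name in ("node-a", "node-b", "node-c"):
    ring.add_node(name)

keys = [f"user:{i}" for i in range(10_000)]
before = {k: ring.owner(k) for k in keys}

ring.add_node("node-d")                 # a new machine joins the cluster
after = {k: ring.owner(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
```

In this model, only keys that fall into ranges claimed by the new node change owner, and they all move to `node-d`; every other key stays where it was. Spreading each machine over many virtual nodes is what keeps the shares roughly even.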
Note
Cassandra is one of several popular distributed databases inspired by the Dynamo
architecture, originally published in a paper by Amazon. Other widely used implementations
of the Dynamo design include Riak and Voldemort. You can read the original paper at
http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf.