A distributed database
In computing, distributed means splitting data or tasks across multiple machines. In the
context of Cassandra, it means that the data is spread across multiple machines: no single
node (a machine in a cluster is usually called a node) holds all the data, just a chunk of it.
You are therefore not limited by the storage and processing capabilities of a single machine.
If the data gets larger, add more machines. If you need more parallelism (the ability to
access data in parallel/concurrently), add more machines. It also means that a node going
down does not mean that all the data is lost (we will cover this issue soon).
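To make the idea of each node holding only a chunk of the data concrete, here is a minimal sketch in Python. It is an illustration under simplifying assumptions, not Cassandra's actual mechanism: it assigns each key to a node by hashing the key and taking the result modulo the number of nodes, whereas Cassandra places data on a token ring. The function names, node names, and keys are all hypothetical.

```python
import hashlib


def node_for_key(key: str, nodes: list[str]) -> str:
    """Return the node that owns `key` (hypothetical helper).

    Simplified hash-modulo placement; Cassandra itself uses a token ring.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


def distribute(keys: list[str], nodes: list[str]) -> dict[str, list[str]]:
    """Group keys by the node that owns them."""
    placement: dict[str, list[str]] = {node: [] for node in nodes}
    for key in keys:
        placement[node_for_key(key, nodes)].append(key)
    return placement


# Made-up keys and node names, for illustration only.
keys = [f"user:{i}" for i in range(10_000)]
three_nodes = distribute(keys, ["node-a", "node-b", "node-c"])
six_nodes = distribute(
    keys, ["node-a", "node-b", "node-c", "node-d", "node-e", "node-f"]
)

for label, placement in (("3 nodes", three_nodes), ("6 nodes", six_nodes)):
    print(label)
    for node, owned in placement.items():
        print(f"  {node}: {len(owned)} keys")
```

Running this shows that no node owns every key, and that each node's share shrinks as nodes are added. Note that with plain modulo placement, changing the number of nodes reassigns most keys, which is one reason real systems prefer a ring-based scheme.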
If a distributed mechanism is well designed, it will scale with the number of nodes.
Cassandra is one of the best examples of such a system. Its performance scales almost
linearly as we add new nodes. This means that Cassandra can handle a behemoth of data
without wincing.
Note
Check out an excellent paper comparing NoSQL databases, titled Solving Big Data
Challenges for Enterprise Application Performance Management, at
http://vldb.org/pvldb/vol5/p1724_tilmannrabl_vldb2012.pdf.