ilar to relational tables) do not need to have matching columns within a row. Even
rows within a ColumnFamily are not required to always follow the same naming
schema. A schema can be defined, but data patterns are not strictly enforced. Data
can also be added in very high volumes and at very high velocity; when several
versions of a piece of data exist, Cassandra determines the correct one by
comparing the timestamps at which they were inserted into the system.
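The timestamp-based resolution described above can be sketched as a
last-write-wins merge. This is a minimal illustration in Python, not Cassandra's
own code: the Cell type and the resolve function are hypothetical names, and
the tie-breaking rule (compare the values when timestamps are equal) is an
assumption made here only to keep the result deterministic.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Cell:
    """One written version of a column value (hypothetical type)."""
    value: str
    timestamp: int  # write time, e.g. microseconds since the epoch


def resolve(versions: Iterable[Cell]) -> Cell:
    """Return the winning version: the one with the highest timestamp.

    Assumption: when two timestamps are equal, the larger value wins, so
    that every caller picks the same version.
    """
    return max(versions, key=lambda cell: (cell.timestamp, cell.value))


if __name__ == "__main__":
    # Three versions of the same column, as different replicas might hold them.
    versions = [
        Cell("alice@old.example", 1_000),
        Cell("alice@new.example", 2_000),
        Cell("alice@stale.example", 1_500),
    ]
    print(resolve(versions).value)
```

Because the winner depends only on the timestamps, the order in which versions
arrive does not matter, which is what lets writes be accepted quickly and
reconciled later.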
Architecturally, its decentralized design eliminates any single point of failure:
every node in the cluster has the same role, so any node can serve any request.
Cassandra also supports replication, including replication across multiple data
centers. Because replication strategies are configurable, you can make your
distribution architecture as centralized or as spread out, and as redundant or
fail-safe, as you like. Because data is automatically replicated across nodes,
downed or faulty nodes are easy to replace. New nodes can be added at will,
without downtime, to increase read and write throughput or simply to improve
availability. Consistency levels are tunable, which lets the application choose,
for each operation, how much effort is spent on data assurance (that is, how many
replicas must respond before the operation counts as successful).
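The trade-off behind tunable consistency can be sketched with a little
arithmetic. The Python below is illustrative only and does not use any
Cassandra driver; the function names are hypothetical. It relies on two
well-known rules: a quorum is a majority of the replicas, and a read is
guaranteed to overlap the most recent successful write when the number of
replicas read plus the number of replicas written exceeds the replication
factor.

```python
def quorum(replication_factor: int) -> int:
    """Size of a majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def read_overlaps_write(read_replicas: int, write_replicas: int,
                        replication_factor: int) -> bool:
    """True when every read set must share a replica with every write set.

    This holds exactly when R + W > RF.
    """
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    # Quorum reads combined with quorum writes always overlap.
    print(q, read_overlaps_write(q, q, rf))
    # Reading and writing a single replica is cheaper, but may miss a write.
    print(1, read_overlaps_write(1, 1, rf))
```

With a replication factor of 3, a quorum is 2 replicas, so quorum reads and
quorum writes overlap, while reading and writing a single replica each does
not. Choosing the level per operation is how an application trades latency
against assurance.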
An ecosystem is also growing around Cassandra. Monitoring systems such as
OpsCenter help you see the health of your cluster and manage common
administration tasks. There are drivers for many of the major languages.
Cassandra now comes with integration points for Hadoop and MapReduce, full-text
search with Solr, and Apache Pig and Hive. There is even a SQL-like query
language, CQL (Cassandra Query Language), to help with data modeling and data
access.
History of Cassandra
Apache Cassandra was originally developed at Facebook in 2008 to power
Facebook's inbox search feature. The original authors were Avinash Lakshman,
who is also one of the authors of the Amazon Dynamo paper, and Prashant Malik.
After being in production at Facebook for a while, Cassandra was released as an
open-source project on Google Code in July of 2008. In March of 2009, it was
accepted by the Apache Software Foundation as an incubator project. In February
of 2010, it became a top-level Apache project.
At the time of this writing, the most recent version of Apache Cassandra is
the 1.2 series. Cassandra has come a long way since the first major release after
its graduation to a top-level Apache project. It has picked up support for Hadoop,
text search integration through Solr, CQL, zero-downtime upgrades, virtual nodes
(vnodes), and self-tuning caches, just to name a few of the major features. Cas-