FIGURE 4.25
Column family representation.
To group column families together and manage the relationships between them, the Cassandra
model provides a super column family:
Super column family. A super column family is a logical and physical grouping of column
families that can be addressed by a single key. This model lets you represent relationships,
hierarchies, and treelike traversals simply and flexibly.
To create a meaningful data structure or architecture, one or more column families and super
column families need to be grouped in one set under a common key. In Cassandra, a keyspace
defines that set of column families grouped under one key. A rough analogy is an Excel document:
Excel document → sheet 1 (columns/formulas) → sheet 2 (other columns/formulas), and so on, where
the document plays the role of the keyspace and each sheet that of a column family. Defining a
single keyspace for an application is preferred over creating thousands of keyspaces for it.
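The nesting described above can be sketched with plain Python dictionaries. This is only a toy in-memory model, not Cassandra's storage format: the names `users`, `user_posts`, and `get_column` are hypothetical, and modeling a super column family as one extra level of nesting under a row key is an assumption of the sketch.

```python
# Toy model of keyspace -> column family -> row key -> columns.
# All names and values here are hypothetical illustrations.
keyspace = {
    # Column family: each row key maps to its own set of columns.
    "users": {
        "alice": {"email": "alice@example.com", "city": "Paris"},
        "bob": {"email": "bob@example.com"},  # rows need not share columns
    },
    # Super column family (assumed shape): row key -> super column -> columns.
    "user_posts": {
        "alice": {
            "post_1": {"title": "Hello", "body": "First post"},
            "post_2": {"title": "Again", "body": "Second post"},
        },
    },
}


def get_column(ks, column_family, row_key, column, super_column=None):
    """Look up one column value, optionally inside a super column."""
    row = ks[column_family][row_key]
    if super_column is not None:
        row = row[super_column]
    return row[column]
```

For example, `get_column(keyspace, "user_posts", "alice", "title", super_column="post_2")` walks keyspace, super column family, row key, and super column before reaching the column.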
A keyspace has configurable properties that are critical to understand:
Replication factor: the number of nodes that hold copies (replicas) of each row of data. If the
replication factor is 2, two nodes hold a copy of each row. Data replication is transparent to
the client. The replication factor is a tunable parameter that, together with the consistency
level chosen for reads and writes, governs consistency in Cassandra and the balance between
performance and scalability.
Replica placement strategy: how the replicas are placed in the deployment ring (discussed in
the architecture section). Two strategies are provided to configure which nodes get copies of
which keys: SimpleStrategy (defined at keyspace creation) and NetworkTopologyStrategy
(replication across data centers).
Column families: each keyspace has one or more column families. A column family has
configurable parameters, described in Table 4.1.
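As a hedged illustration of how the replication factor and placement strategy are attached to a keyspace, the helper below builds a CQL 3 `CREATE KEYSPACE` statement as a string. The text above does not show any syntax, so the use of CQL is an assumption, and the function name `create_keyspace_cql`, the keyspace name `demo`, and the data center names `dc1` and `dc2` are hypothetical.

```python
def create_keyspace_cql(name, strategy="SimpleStrategy",
                        replication_factor=2, dc_factors=None):
    """Build a CQL CREATE KEYSPACE statement (string only, nothing is executed)."""
    if strategy == "SimpleStrategy":
        # One replication factor for the whole ring.
        options = {"class": strategy, "replication_factor": replication_factor}
    elif strategy == "NetworkTopologyStrategy":
        # One replication factor per data center, e.g. {"dc1": 3, "dc2": 2}.
        if not dc_factors:
            raise ValueError("NetworkTopologyStrategy needs per-data-center factors")
        options = {"class": strategy, **dc_factors}
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    def fmt(value):
        # CQL map literals quote strings with single quotes; integers stay bare.
        return f"'{value}'" if isinstance(value, str) else str(value)

    body = ", ".join(f"'{key}': {fmt(value)}" for key, value in options.items())
    return f"CREATE KEYSPACE {name} WITH replication = {{{body}}};"
```

Calling `create_keyspace_cql("demo")` yields a statement using SimpleStrategy with a replication factor of 2; passing `strategy="NetworkTopologyStrategy"` with `dc_factors` sets a separate factor for each data center.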
As we have learned so far, a keyspace provides the data structure in which Cassandra stores
column families and their subgroups. To store the keyspace and its associated metadata, Cassandra
provides the architecture of a cluster, often referred to as the ring. Cassandra distributes data
to the nodes by arranging them in a ring that forms the cluster.
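The ring idea can be sketched as follows, under stated assumptions: each node is given a token by hashing its name with MD5 (real clusters assign tokens differently), a row key is hashed the same way, and replicas are taken by walking clockwise from the node that owns the key's token, which mirrors the simple placement strategy. The `Ring` class and the node names are hypothetical.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a position on the ring (illustrative MD5-based token)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, nodes):
        # Sort nodes by token so the list order is the clockwise ring order.
        self._ring = sorted((token(node), node) for node in nodes)
        self._tokens = [t for t, _ in self._ring]

    def replicas(self, row_key, replication_factor):
        """Return the nodes that would hold copies of the row."""
        if replication_factor > len(self._ring):
            raise ValueError("replication factor exceeds number of nodes")
        # First node whose token is >= the key's token, wrapping past the end.
        start = bisect.bisect_left(self._tokens, token(row_key)) % len(self._ring)
        return [self._ring[(start + i) % len(self._ring)][1]
                for i in range(replication_factor)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
owners = ring.replicas("alice", replication_factor=2)
```

With a replication factor of 2, `owners` contains two distinct nodes: the one that owns the key's token and its clockwise neighbor. The same key always maps to the same nodes as long as the ring membership does not change.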
 