This allows faster retrieval of records using binary search. Because a B-tree keeps data
sorted for faster searching, it introduces some overhead on insert, update, and delete
operations, which may require rearranging the index. B-trees are the preferred data
structure for workloads with a large volume of reads and writes, which is why they are
widely used in distributed databases.
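The trade-off described above can be sketched in a few lines of Python. This is only an illustration, assuming a plain sorted list as a stand-in for the ordered keys a B-tree maintains; the names `keys`, `contains`, and `insert` are hypothetical and not taken from any database.

```python
import bisect

# A sorted list stands in for the ordered keys of a B-tree (illustrative assumption).
keys = [3, 8, 15, 23, 42]

def contains(sorted_keys, key):
    """Binary search: O(log n) comparisons because the keys are sorted."""
    i = bisect.bisect_left(sorted_keys, key)
    return i < len(sorted_keys) and sorted_keys[i] == key

def insert(sorted_keys, key):
    """Insert while preserving order; later elements shift to make room."""
    bisect.insort(sorted_keys, key)

found = contains(keys, 23)    # key is present
missing = contains(keys, 10)  # key is absent before the insert
insert(keys, 10)              # the write pays the cost of keeping order
```

Lookups stay cheap because the order is always maintained, while each write does extra work to keep it that way. A real B-tree limits that work by splitting the keys across fixed-size nodes rather than shifting one large array.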
Clustered Indexes vs. Non-Clustered Indexes
Indexes that are maintained independently of the physical rows and do not manage the
ordering of rows are called non-clustered indexes (see Figure 3-1). Clustered indexes,
on the other hand, store the actual rows in sorted order of the indexed field. Because a
clustered index stores and manages the ordering of physical rows, only one clustered
index is possible per table.
The important question is which scenarios call for a clustered index and which for a
non-clustered index. For example, a department can have multiple employees (a
many-to-one relation from employee to department), and employee details often need to
be read by department. Here department is a suitable candidate for a clustered index:
all rows containing employee details would be stored and ordered by department for
faster retrieval. Employee name, in turn, is a good candidate for a non-clustered
index. A table can hold multiple non-clustered indexes, but only a single clustered
index.
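The department and employee example can be modeled with a short Python sketch. It assumes an in-memory list as the "table" and made-up sample rows; `employees_in` and `employee_by_name` are hypothetical helper names used only for illustration.

```python
from bisect import bisect_left, bisect_right

# Hypothetical employee rows: (department, name, salary).
rows = [
    ("sales", "carol", 50),
    ("eng", "alice", 70),
    ("eng", "bob", 65),
    ("hr", "dave", 55),
]

# Clustered index: the rows themselves are stored sorted by department.
table = sorted(rows, key=lambda r: r[0])
departments = [r[0] for r in table]

def employees_in(dept):
    """Range scan over physically adjacent rows for one department."""
    lo = bisect_left(departments, dept)
    hi = bisect_right(departments, dept)
    return table[lo:hi]

# Non-clustered index: a separate structure mapping name -> row position.
# It does not influence how the rows are ordered in the table.
name_index = {row[1]: pos for pos, row in enumerate(table)}

def employee_by_name(name):
    """Look up the row position in the index, then fetch the row."""
    pos = name_index.get(name)
    return None if pos is None else table[pos]
```

Because the rows can be physically sorted in only one order, there is room for just one clustered index, whereas any number of side structures like `name_index` could be added for other columns.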
Index Distribution
With distributed databases, data gets distributed and replicated across multiple nodes.
Retrieving a collection of data may therefore require fetching rows from multiple nodes.
An index over a non-row-key column would likewise need to be distributed across
multiple nodes, such as shards. Long-running queries can benefit from such shard-based
indexing for fast retrieval of data sets.
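A minimal sketch of shard-based indexing follows. It is a simplified model and not how any particular database implements it: the shard count, the `Shard` class, the `city` column, and the CRC32-based placement are all assumptions made for illustration.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 3  # hypothetical cluster size

class Shard:
    """One node: holds its share of rows plus a local index on 'city'."""

    def __init__(self):
        self.rows = {}                    # row key -> row
        self.by_city = defaultdict(set)   # local index: city -> row keys

    def put(self, key, row):
        old = self.rows.get(key)
        if old is not None:
            # Drop the stale index entry before overwriting the row.
            self.by_city[old["city"]].discard(key)
        self.rows[key] = row
        self.by_city[row["city"]].add(key)

    def find_by_city(self, city):
        return [self.rows[k] for k in self.by_city.get(city, ())]

shards = [Shard() for _ in range(NUM_SHARDS)]

def shard_for(key):
    # Rows are placed by a hash of the row key; crc32 is stable across runs.
    return shards[zlib.crc32(key.encode()) % NUM_SHARDS]

def put(key, row):
    shard_for(key).put(key, row)

def find_by_city(city):
    """Query on a non-row-key column: ask every shard, then merge results."""
    results = []
    for shard in shards:
        results.extend(shard.find_by_city(city))
    return sorted(results, key=lambda r: r["name"])

put("u1", {"name": "alice", "city": "Pune"})
put("u2", {"name": "bob", "city": "Delhi"})
put("u3", {"name": "carol", "city": "Pune"})
pune = [r["name"] for r in find_by_city("Pune")]
```

In this model a lookup by row key touches a single shard, while a lookup by the indexed column fans out to every shard, each of which answers from its own local index instead of scanning all of its rows.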
Because of its peer-to-peer architecture, each node in a Cassandra cluster holds an
identical configuration. Data replication, eventual consistency, and the partitioning
scheme are important aspects of data distribution.
Please refer to Chapter 1 for more details about replication factor, strategy class,
and read/write consistency.
Indexing in Cassandra