Databases Reference
In-Depth Information
Obviously, again the index lookup is a preferred choice. Although the topic index and database
index are analogous, stretching the similarity too far can cause confusion. Book indexes are on free
text and so the number of words or terms indexed is restricted to an important subset of the entire
possible set. On the other hand, database indexes apply to all data sets in a collection. Indexes are
created on an item identifi er or a specifi c property.
ESSENTIAL CONCEPTS BEHIND A DATABASE INDEX
There is no single universal formula for creating an index but most useful methods hinge on a
few common ideas. The building blocks of these common ideas reside in hash functions and
B-tree and B+-tree data structures. In this section you peruse these ideas to understand the
underlying theory.
A hash function is a well-defi ned mathematical function to convert a large, and often variable-sized
and complex, data value to a single integer or a set of bytes. The output of a hash function is known
by various names, including hash code, hash value, hash sum, and checksum. A hash code is often
used as the key for an associative array, also called a hash map. Hash functions come in handy when
you are mapping complex database property values to hash codes for index creation.
A tree data structure distributes a set of values in a tree-like structure. Values are structured in a
hierarchical manner with links or pointers between certain nodes in the tree. A binary tree is a tree
that has at most two child nodes: one on the left and other on the right. A node can be a parent,
in which case it has at most two nodes, or it can be a leaf, in which case it's the last node in the
chain. At the base of a tree structure is a root node. See Figure 8-1 to understand a binary tree
data structure.
root node
5
left node
right node
4
6
leaf node
1
3
5
FIGURE 8-1
A B-tree is a generalization of a binary tree. It allows more than two child nodes for a parent
node. A B-tree keeps the data sorted and therefore allows for effi cient search and data access.
A B+-tree is a special variant of a B-tree. In a B+-tree, all records are stored in the leaves and
leaves are sequentially linked. A B+-tree is the most common tree structure used to store a
database index.
Search WWH ::




Custom Search