Database Reference
In-Depth Information
over multiple columns of the underlying table. In that case, internal
nodes are defined on ranges of composite values from the index columns.
Querying indexes. Consider an index I
=
(
c 1 ,
c 2 ,... ,
c n )
. The two main
querying operations we can perform over such an index are Index Scans
and Index Seeks . Scanning an index locates the left-most leaf node in
the B + -tree and returns the values of the columns in the index without
actually going to the heap or clustered index. Since non-clustered in-
dexes are typically defined over a subset of the table columns, they are
smaller than the corresponding table heap or clustered index. Therefore,
Index Scans are good evaluation alternatives if the query requires only
a subset of columns present in the index pages. The second operator,
Index Seek , provides an ecient way to evaluate predicates of the form
c 1 = v 1 (or, generally, c 1 = v 1 ...
T
c k + 1 v k + 1 ). We can evalu-
ate this predicate eciently by seeking the index, that is, traversing the
tree from the root to the first tuple in a leaf node that satisfies all predi-
cates (if there is such a tuple) and performing the analogous of an Index
Scan as previously described from that point until the index leaf page
where the predicate is no longer satisfied. Unlike binary search trees,
B + trees have very high fan-out (typically in the order of hundreds),
which reduces the number of I/O operations required to find an element
in the tree. Thus, a height of at most five suces for virtually all table
sizes. Therefore, seeking an index can be done using a handful of I/Os,
which is very ecient. Moreover, a large fraction of index internal nodes
might already be resident in memory. If a query requires columns that
are not part of the index definition, a subsequent lookup to the table
heap or clustered index is required. In that case, the RID of each tuple
is used to look up into the table heap (or an additional seek for the case
of clustered indexes). Since tuples in heaps or clustered indexes are not
necessarily laid out in the same order as tuples in secondary indexes,
such lookup typically requires an additional disk I/O, which can quickly
become expensive.
Updating indexes. Each time a tuple is modified, inserted, or deleted, all
relevant indexes are updated as well. For inserts and deletes on a table,
all indexes defined over such table are updated. For an update state-
ment, only those indexes that are defined over an update column are
modified. Thus, INSERT/DELETE/UPDATE statements are the main factor
against wider index adoption (the second drawback being storage con-
straints). An update-intensive query workload may suffer performance
degradation due to the presence of multiple indexes.
Key vs. included columns. Some database systems have the ability to dif-
ferentiate index columns into key columns and included columns . The
main difference between these two types of columns is that included
columns are not part of B + -tree internal nodes but are present in the
leaf nodes. Therefore, we cannot use an index to evaluate predicates that
c k = v k
Search WWH ::




Custom Search