Database Reference
In-Depth Information
Some DBMSs have the option of avoiding the need for the table. The leaf pages
of one of the indexes then effectively contain the table rows.
In Oracle, this option is called an index-organized table, and the index con-
taining the table rows is called the primary index. In SQL Server, the table rows
are stored in an index created with the option CLUSTERED. In both cases, the
other indexes (called secondary indexes in Oracle and unclustered indexes in
SQL Server) point to the index that contains the table rows.
The obvious advantage of index-only tables is a saving in disk space. In
addition, INSERTs, UPDATEs, and DELETEs are a little faster because there is
one less page to modify.
There are, however, disadvantages relating to the other indexes. If these point
to the table row using a direct pointer (containing the leaf page number), a leaf
page split in the primary (clustered) index causes a large number of disk I/Os
for the other indexes. Any update to the primary index key that moves the index
row, forces the DBMS to update the index rows pointing to the displaced index
row. This is why SQL Server, for instance, now uses the key of the primary
index as the pointer to the clustered index. This eliminates the leaf page split
overhead, but the unclustered indexes become larger if the clustered index has a
long key itself. Furthermore, any access via a nonclustered index goes through
two sets of nonleaf pages; first, those of the unclustered index and then those of
the clustered index. This overhead is not a major issue as long as the nonleaf
pages stay in the buffer pool.
The techniques presented in this topic apply equally well to index-only tables ,
although the diagrams always show the presence of the table. If index-only tables
are being used, the primary (clustered) table should be considered as a clustering
index that is fat for all SELECTs . This last statement may not become clear until
Chapter 4 has been considered. The order of the index rows is determined by the
index key. The other columns are nonkey columns.
Note that in SQL Server the clustered index does not have to be the primary
key index . However, to reduce pointer maintenance, it is a common practice to
choose an index whose key is not updated, such as a primary or a candidate key
index. In most indexes (the nonkey column option will be discussed later), all
index columns make up the key, so it may be difficult to find other indexes in
which no key column is updated.
Page Adjacency
Are the logically adjacent pages (such as leaf page 1 and leaf page 2) physically
adjacent on disk? Sequential read would be very fast if they are (level 2 in
Fig. 2.10).
In some older DBMSs, such as SQL/DS and the early versions of SQL
Server, the pages of an index or table could be spread all over a large file. The
only difference in the performance of random and sequential read was then due
to the fact that a number of logically adjacent rows resided in the same page
(level 1 in Fig. 2.10). Reading the next page required a random I/O. If there are
Search WWH ::




Custom Search