Table and Index Organization - Relational Database Index Design and the Optimizers

Database Reference

In-Depth Information

Some DBMSs have the option of avoiding the need for the table. The leaf pages

of one of the indexes then effectively contain the table rows.

In Oracle, this option is called an index-organized table, and the index con-

taining the table rows is called the primary index. In SQL Server, the table rows

are stored in an index created with the option CLUSTERED. In both cases, the

other indexes (called secondary indexes in Oracle and unclustered indexes in

SQL Server) point to the index that contains the table rows.

The obvious advantage of index-only tables is a saving in disk space. In

addition, INSERTs, UPDATEs, and DELETEs are a little faster because there is

one less page to modify.

There are, however, disadvantages relating to the other indexes. If these point

to the table row using a direct pointer (containing the leaf page number), a leaf

page split in the primary (clustered) index causes a large number of disk I/Os

for the other indexes. Any update to the primary index key that moves the index

row, forces the DBMS to update the index rows pointing to the displaced index

row. This is why SQL Server, for instance, now uses the key of the primary

index as the pointer to the clustered index. This eliminates the leaf page split

overhead, but the unclustered indexes become larger if the clustered index has a

long key itself. Furthermore, any access via a nonclustered index goes through

two sets of nonleaf pages; first, those of the unclustered index and then those of

the clustered index. This overhead is not a major issue as long as the nonleaf

pages stay in the buffer pool.

The techniques presented in this topic apply equally well to index-only tables ,

although the diagrams always show the presence of the table. If index-only tables

are being used, the primary (clustered) table should be considered as a clustering

index that is fat for all SELECTs . This last statement may not become clear until

Chapter 4 has been considered. The order of the index rows is determined by the

index key. The other columns are nonkey columns.

Note that in SQL Server the clustered index does not have to be the primary

key index . However, to reduce pointer maintenance, it is a common practice to

choose an index whose key is not updated, such as a primary or a candidate key

index. In most indexes (the nonkey column option will be discussed later), all

index columns make up the key, so it may be difficult to find other indexes in

which no key column is updated.

Page Adjacency

Are the logically adjacent pages (such as leaf page 1 and leaf page 2) physically

adjacent on disk? Sequential read would be very fast if they are (level 2 in

Fig. 2.10).

In some older DBMSs, such as SQL/DS and the early versions of SQL

Server, the pages of an index or table could be spread all over a large file. The

only difference in the performance of random and sequential read was then due

to the fact that a number of logically adjacent rows resided in the same page

(level 1 in Fig. 2.10). Reading the next page required a random I/O. If there are

Relational Database Index Design and the Optimizers

Search WWH ::

Custom Search

Home