Table and Index Organization - Relational Database Index Design and the Optimizers

Database Reference

In-Depth Information

ORing (covered in Chapters 6 and 10) bitmap indexes is very fast, even when

there are hundreds of millions of table rows. The corresponding operation with

B-tree indexes requires collecting a large number of pointers and sorting large

pointer sets.

On the other hand a B-tree index, containing the appropriate columns, elimi-

nates table access. This is important because random I/Os to a large table are very

slow (about 10 ms). With a bitmap index, the table rows must be accessed unless

the SELECT list contains only COUNTs. Therefore, the total execution time using

a bitmap index may be much longer than with a tailored, (fat) B-tree index.

Bitmap indexes should be used when the following conditions are true:

1. The number of possible predicate combinations is so large that designing

adequate B-tree indexes is not feasible.

2. The simple predicates have a high filter factor (considered in Chapter 3),

but the compound predicate (WHERE clause) has a low filter factor—or

the SELECT list contains COUNTs only.

3. The updates are batched (no lock contention).

Hashing

Hashing—or randomizing—is a fast way to retrieve a single table row whose

primary key value is known. When the row is stored, the table page is chosen by

using a randomizer, which converts the primary key value into a page number

between 1 and N. If that page is already full, the row is placed in another page,

chained to the home page. When a SELECT ... WHERE PK = :PK is issued,

the randomizer is used again to determine the home page number. The row is

either found in that page or by following the chain that starts on that page.

Randomizers were commonly used in nonrelational DBMSs such as IMS

and IDMS. When the area size (N) was right—corresponding to about 70%

space utilization, the number of I/Os to retrieve a record could have been as low

as 1.1, which was very low compared to an index (a three-level index at that

time could require two I/Os—plus a third to access the record itself). However,

the space utilizations required constant monitoring and adjustments. When many

records were added, the overflow chains grew and the number of I/Os increased

dramatically. Furthermore, range predicates were not supported. Oracle provides

an option for the conversion of a primary key value to a database page number

by hashing.

Many Meanings of Cluster

Cluster is a term that is widely used throughout relational literature. It is also a

source of much confusion because its meaning varies from product to product.

In DB2 (z/OS, LUW, VM, and VSE), a clustering index refers to an index

that defines the home page for a table row being inserted. An index is clustered if

there is a high correlation between the order of the index rows and the table rows.

Search WWH ::

Custom Search

Home