Indexing for Table Joins - Relational Database Index Design and the Optimizers

Database Reference

In-Depth Information

table also happened to have the lower number of local rows, but obviously,

this is not always the case.

2. Very small tables . Very small tables and their indexes may stay in the

database buffer pool all day long—or at least no page would be read

more than once during a join. Random touches to such tables and indexes

take less than 0.1 ms, at least after the first touch per page; so when the

table is not the outer table, a large number of random touches to them is

not a problem.

3. Clustering ratio . The correlation between the order of the index rows and

the table rows (100% for a clustering index after a table reorganization)

may affect the choice of the best table access order, unless, of course, the

indexes are fat.

The best cost-based optimizers take these factors into account when they

choose the table access order. Therefore, they may find a better sequence than

the one given by the rule of thumb based on the number of local rows.

The reader may wonder why the rule of thumb uses the maximum filter

factors for the local predicates. This is simply because it is probably too time con-

suming to include the filter factor pitfall. This simplification certainly increases

the risk of inappropriate recommendations where one alternative avoids a sort

and the program does not fetch the whole result in one transaction. Neverthe-

less, indexing based on an assumed table access order may lead to adequate

performance even if the table order access is not optimal.

When designing indexes for a join, where should we start? In theory, we

should first create the best indexes for all table access order variations and then

let the optimizer choose. Then we could remove any unused indexes and index

columns. However, this would be too time consuming, particularly if there are

more than two tables in the join—and the choice made by the optimizer will still

depend on the input values. The index design tools that use the optimizer solve

the first issue but not the second one. The quality of their proposals depends on

the filter factor assumptions, which in turn depend on the input affecting the host

variables in the predicates.

Fortunately, the rule of thumb leads to a satisfactory result in most

cases—and often we do not even need to calculate the number of local rows

because the best table access order is obvious.

The important issue is to keep the assumed join method and table access

order in mind (or probably on the wall) when designing indexes for a join.

Then the indexes are designed just as if the program had been coded with

several single-table cursors instead of a join cursor. It is a common mistake

to design indexes for a join without a consistent access path assumption with

regard to the join method and table access order. It is not sufficient merely to

have indexes for both the join predicates and the local predicates; actually this

approach often results in a few redundant indexes. It is often essential that the

inner table in a nested-loop join has a good fat index, starting with the join

predicate column.

Relational Database Index Design and the Optimizers

Search WWH ::

Custom Search

Home