Deriving the Ideal Index for a SELECT - Relational Database Index Design and the Optimizers

Database Reference

In-Depth Information

had created 60 indexes on one table. Many of these could have been dropped

(and were indeed dropped) to make inserts faster, without making any query

noticeably slower.

It is important to take a close look at the existing indexes after designing

the best possible index (or indexes) for a new SELECT. It is possible that one

of the existing indexes would be almost as good as the ideal index, particularly

if a few columns were to be added after the existing index columns.

When analyzing how useful an existing index would be for a new SELECT,

it should be borne in mind that superfluous indexes fall into three categories:

totally superfluous indexes, practically superfluous indexes, and possibly super-

fluous indexes.

Totally Superfluous Indexes

AS/400 users noticed many years ago that if one query contained WHERE

A = :A AND B = :B while another query had WHERE B = :B AND A = :A,

the DBMS created two indexes: (A, B) and (B, A). One of these would be totally

superfluous assuming no query contained range predicates for column A or col-

umn B. There is no need for two phone books, one sorted by LNAME, FNAME,

the other by FNAME, LNAME if both the last name and the first name are

always known. AS/400 users soon learned to standardize their WHERE clauses,

but the problem reappeared with program generators.

Practically Superfluous Indexes

Assume index (LNAME, CITY, FNAME, CNO) already exists. The ideal index

for a new SELECT consists of 14 columns, starting with the four columns of

the existing index. Should the old index be dropped after creating the 14-column

index? Some DBAs would hesitate to do so because the existing index is unique.

However, it is not a primary key index or a candidate key index. It just happens

to contain column CNO, the primary key. There is no integrity exposure when

additional columns are added to this index. If the DBMS supports nonkey index

columns or constraints to enforce uniqueness, columns may even be added to a

primary key index or any index whose key values must be unique. So, this is

purely a performance issue: Will a SELECT that currently uses the 4-column

index run noticeably slower with the new 14-column index?

How much longer does it take to scan a 10,000-row slice to pick up 1000

index entries if the index row grows, say, from 50 to 200 bytes? The CPU time

increases by an insignificant amount, but the I/O time is proportional to the

number of pages accessed.

CPU time

=

1000

×

0

.

1ms

+

10

,

000

×

0

.

005 ms

=

150 ms

(

1000 FETCH calls and 10

,

000 index rows in both cases

)

Number of 4K leaf pages

(

4 columns

)

:

1

.

5

×

10

,

000

×

50

/

4000

=

200

(

1

.

5 for free space

)

Relational Database Index Design and the Optimizers

Search WWH ::

Custom Search

Home