Deriving the Ideal Index for a SELECT - Relational Database Index Design and the Optimizers

Database Reference

In-Depth Information

Would one of these provide acceptable response times, even with the most com-

mon last name (filter factor 1%)?

With the first alternative, the DBMS would scan the index slice defined by

predicate LNAME

:LNAME. For each index row in the slice, the DBMS must

check the CITY value in the table. As the table rows are clustered by CNO and

not by LNAME, this would probably require one random read from the disk. With

the most common LNAME, obtaining the whole result would mean examining

10,000 index rows and 10,000 table rows, regardless of the filter factor for CITY.

How long would this take?

Let us assume that the size of index (LNAME, FNAME)—including system

data and distributed free space—is 1 , 000 , 000 × 100 bytes = 100 MB. Let us

also assume that sequential read speed is 40 MB/s. Reading a 1% slice, 1 MB,

then takes

=

10 ms + 1MB / 40 MB/s = 35 ms

This is obviously not a problem, but the 10,000 random reads would take

10 , 000 × 10 ms = 100 s

making this alternative far too slow.

With the second alternative, only the first page requires a random read. If the

size of the table—including distributed free space—is 1

,

000

,

000

×

600 bytes

=

600 MB, the I/O time would be

10 ms + 600 MB / 40 MB/s = 15 s, still far too long

The CPU time would be considerably higher with the second alternative

because the DBMS must examine 1,000,000 rows instead of 20,000, and the

result rows must be sorted. On the other hand, the CPU time and the I/O time

overlap, thanks to sequential reads. A full table scan would be faster than the

inadequate index in this case, but it is not fast enough; a better index would

be required.

THREE-STAR INDEX—THE IDEAL INDEX FOR A SELECT

After discussing a very inadequate index for CURSOR41 in the previous section,

let's go to the other extreme, a three-star index , which is by definition the best

possible index for a given SELECT. A single table SELECT using a three-star

index, such as the one shown in Figure 4.2, normally needs only one random disk

drive read and a scan of a thin slice of an index. Therefore the response times are

often a couple of orders of magnitude shorter than those with a mediocre index.

The elapsed time for CURSOR41, shown again in SQL 4.2, is a fraction of

a second, even if the result table has 1000 rows. How is that possible? Figure 4.2

depicts the lowest level of the index, the leaf pages. The upper levels (nonleaf

pages) will be discussed later.

Relational Database Index Design and the Optimizers

Search WWH ::

Custom Search

Home