Database Reference
In-Depth Information
Would one of these provide acceptable response times, even with the most com-
mon last name (filter factor 1%)?
With the first alternative, the DBMS would scan the index slice defined by
predicate LNAME
:LNAME. For each index row in the slice, the DBMS must
check the CITY value in the table. As the table rows are clustered by CNO and
not by LNAME, this would probably require one random read from the disk. With
the most common LNAME, obtaining the whole result would mean examining
10,000 index rows and 10,000 table rows, regardless of the filter factor for CITY.
How long would this take?
Let us assume that the size of index (LNAME, FNAME)—including system
data and distributed free space—is 1 , 000 , 000 × 100 bytes = 100 MB. Let us
also assume that sequential read speed is 40 MB/s. Reading a 1% slice, 1 MB,
then takes
=
10 ms + 1MB / 40 MB/s = 35 ms
This is obviously not a problem, but the 10,000 random reads would take
10 , 000 × 10 ms = 100 s
making this alternative far too slow.
With the second alternative, only the first page requires a random read. If the
size of the table—including distributed free space—is 1
,
000
,
000
×
600 bytes
=
600 MB, the I/O time would be
10 ms + 600 MB / 40 MB/s = 15 s, still far too long
The CPU time would be considerably higher with the second alternative
because the DBMS must examine 1,000,000 rows instead of 20,000, and the
result rows must be sorted. On the other hand, the CPU time and the I/O time
overlap, thanks to sequential reads. A full table scan would be faster than the
inadequate index in this case, but it is not fast enough; a better index would
be required.
THREE-STAR INDEX—THE IDEAL INDEX FOR A SELECT
After discussing a very inadequate index for CURSOR41 in the previous section,
let's go to the other extreme, a three-star index , which is by definition the best
possible index for a given SELECT. A single table SELECT using a three-star
index, such as the one shown in Figure 4.2, normally needs only one random disk
drive read and a scan of a thin slice of an index. Therefore the response times are
often a couple of orders of magnitude shorter than those with a mediocre index.
The elapsed time for CURSOR41, shown again in SQL 4.2, is a fraction of
a second, even if the result table has 1000 rows. How is that possible? Figure 4.2
depicts the lowest level of the index, the leaf pages. The upper levels (nonleaf
pages) will be discussed later.
Search WWH ::




Custom Search