Database Reference
In-Depth Information
ORing (covered in Chapters 6 and 10) bitmap indexes is very fast, even when
there are hundreds of millions of table rows. The corresponding operation with
B-tree indexes requires collecting a large number of pointers and sorting large
pointer sets.
On the other hand a B-tree index, containing the appropriate columns, elimi-
nates table access. This is important because random I/Os to a large table are very
slow (about 10 ms). With a bitmap index, the table rows must be accessed unless
the SELECT list contains only COUNTs. Therefore, the total execution time using
a bitmap index may be much longer than with a tailored, (fat) B-tree index.
Bitmap indexes should be used when the following conditions are true:
1. The number of possible predicate combinations is so large that designing
adequate B-tree indexes is not feasible.
2. The simple predicates have a high filter factor (considered in Chapter 3),
but the compound predicate (WHERE clause) has a low filter factor—or
the SELECT list contains COUNTs only.
3. The updates are batched (no lock contention).
Hashing
Hashing—or randomizing—is a fast way to retrieve a single table row whose
primary key value is known. When the row is stored, the table page is chosen by
using a randomizer, which converts the primary key value into a page number
between 1 and N. If that page is already full, the row is placed in another page,
chained to the home page. When a SELECT ... WHERE PK = :PK is issued,
the randomizer is used again to determine the home page number. The row is
either found in that page or by following the chain that starts on that page.
Randomizers were commonly used in nonrelational DBMSs such as IMS
and IDMS. When the area size (N) was right—corresponding to about 70%
space utilization, the number of I/Os to retrieve a record could have been as low
as 1.1, which was very low compared to an index (a three-level index at that
time could require two I/Os—plus a third to access the record itself). However,
the space utilizations required constant monitoring and adjustments. When many
records were added, the overflow chains grew and the number of I/Os increased
dramatically. Furthermore, range predicates were not supported. Oracle provides
an option for the conversion of a primary key value to a database page number
by hashing.
Many Meanings of Cluster
Cluster is a term that is widely used throughout relational literature. It is also a
source of much confusion because its meaning varies from product to product.
In DB2 (z/OS, LUW, VM, and VSE), a clustering index refers to an index
that defines the home page for a table row being inserted. An index is clustered if
there is a high correlation between the order of the index rows and the table rows.
Search WWH ::




Custom Search