When all the high-water marks have been refreshed, the average FETCH may
require 1000 sequential touches to the first index but only one touch, normally
random, to the second index. Thus, the QUBE for a transaction with 20 FETCHes
would be:
20 × 10 ms + 20,000 × 0.01 ms + 20 × 0.1 ms = 0.4 s
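Term by term, using the QUBE unit costs applied throughout (10 ms per random touch, 0.01 ms per sequential touch, and 0.1 ms per FETCH call):

$$
\begin{aligned}
20 \times 10\,\mathrm{ms} &= 200\,\mathrm{ms} &&\text{(one random touch per FETCH to the second index)}\\
20{,}000 \times 0.01\,\mathrm{ms} &= 200\,\mathrm{ms} &&\text{(}20 \times 1000\text{ sequential touches to the first index)}\\
20 \times 0.1\,\mathrm{ms} &= 2\,\mathrm{ms} &&\text{(20 FETCH calls)}\\
\text{total} &\approx 402\,\mathrm{ms} \approx 0.4\,\mathrm{s}
\end{aligned}
$$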
Cost of Denormalization
The biggest performance concern is normally the I/O time required to update
the redundant data added to the table and to one of its indexes. With downward
denormalization, a large number of index rows may have to be moved, which
may make a simple UPDATE very slow. Upward denormalization is not as likely
to cause I/O bursts due to a single update, but many INSERTs, UPDATEs, and
DELETEs may cause a few extra disk I/Os to the parent table and to one of its
indexes. In extreme cases, say more than 10 INSERTs or UPDATEs per second,
the disk drive load created by these I/Os can be an issue.
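To illustrate where that update cost comes from, here is a minimal sketch of downward denormalization, assuming hypothetical CUST and INVOICE tables (the names, columns, and trigger are illustrative only, and trigger syntax varies by DBMS). Renaming one customer now forces an UPDATE of every one of that customer's invoice rows, and of any INVOICE index containing CNAME:

```sql
-- Downward denormalization (sketch): the parent column CUST.CNAME
-- is copied redundantly into every child row in INVOICE.
ALTER TABLE INVOICE ADD CNAME VARCHAR(30);

-- Trigger to keep the redundant column in sync. A single UPDATE of
-- CUST.CNAME now touches all of that customer's invoice rows -- and
-- moves the corresponding index rows in any index on INVOICE.CNAME.
CREATE TRIGGER CUST_RENAME
    AFTER UPDATE OF CNAME ON CUST
    REFERENCING NEW AS N
    FOR EACH ROW
    UPDATE INVOICE
       SET CNAME = N.CNAME
     WHERE CNO = N.CNO;
```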
Nested-Loop Join and MS/HJ Versus Denormalization
It is understandable that many database specialists are reluctant to add redundant
columns to operational tables. Denormalization is not only a trade-off between
retrieval speed and update speed—it is also, to some extent, a trade-off between
performance and data integrity, even if triggers are used to maintain the redundant
data. But when a nested-loop join causes too many random reads and MS/HJ
consumes too much CPU time, denormalization may appear to be the only option
available. Nevertheless, before taking this drastic step, it is obviously of great
importance to ensure that everything that could be done to avoid it has been
done. This means being absolutely sure that the best fat indexes for NLJ or
MS/HJ have been considered, and making any troublesome indexes resident in
memory; purchasing additional memory to keep more indexes resident, or new
disks with fast sequential read, might push the break-even point for
denormalization just that little bit further.
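As a reminder of what the first of these options means, a fat index contains every column the query needs from its table, so the inner table of an NLJ need never be touched at all. A minimal sketch, assuming a hypothetical join from CUST to INVOICE on CNO in which the query reads only IDATE and IEUR from the invoices:

```sql
-- Fat index for the inner table of the NLJ: CNO provides the join
-- (matching) column; IDATE and IEUR make the access index-only,
-- eliminating random touches to the INVOICE table itself.
CREATE INDEX INVOICE_FAT
    ON INVOICE (CNO, IDATE, IEUR);
```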
This is without doubt a difficult balancing act. We do not want to give the
impression that non-BJQ is always easy to fix with superfast sequential scans.
The figures in our case study might suggest this, but real-life tables are often
much larger, containing over 100 million rows, and with high transaction rates
CPU time remains a very important consideration.
Unconscious Table Design
From the performance point of view, it is more difficult to understand why so
many databases have tables that have a 1:1 or 1:C (C = conditional; 0 or 1)
relationship, as shown in Figure 8.23.
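To make the question below concrete, here is a single-table sketch (all names hypothetical, not taken from Figure 8.23): attributes that apply to only some customers are simply left NULL in the rows to which they do not apply.

```sql
-- One CUST table instead of a base table plus 1:1 / 1:C satellite
-- tables; the conditional (0 or 1) attributes are nullable columns.
CREATE TABLE CUST (
    CNO        INTEGER      NOT NULL PRIMARY KEY,
    CNAME      VARCHAR(30)  NOT NULL,
    CTYPE      CHAR(2)      NOT NULL,
    -- attributes present for only some customers (1:C):
    FAX_NUMBER VARCHAR(20),
    DISCOUNT   DECIMAL(3,2)
);
```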
Why create four tables instead of one CUST table? Flexibility is not an
issue when the relationship can never become 1:M. In this example a customer