Introduction - Relational Database Index Design and the Optimizers

Database Reference

In-Depth Information

properly and how to organize the table and what rules to follow to put the right

columns into the index—and often it works. But we still seem to continue to

have performance problems, despite the fact that many of these topics are really

very good, and their authors really know what they are talking about.

Of particular interest to us in this topic is the part of the relational system

(called the SQL optimizer) that decides how to find all the information required in

the most efficient way it can. In an ideal world, we wouldn't even need to know

it exists, and indeed most people are quite happy to leave it that way! Having

made this decision, the optimizer directs scans of indexes and tables to find our

data. In order to understand what's going through the optimizer's mind, we will

also need to appreciate what is involved in these scans.

So what we want to do in this topic is first to try to put ourselves in the

optimizer's place; how it decides what table and index scans should be performed

to process SQL statements as efficiently as possible. Perhaps if we understand why

it might have problems, we could do things differently; not by simply following

a myriad of incredibly complex rules that, even if we can understand them might

or might not apply, but by understanding what it is trying to do.

A major concern that one might reasonably be expected to have on hearing

this is that it would appear to be too complex or even out of the question. But it

is quite surprising how little we really need to understand ; what there is, though,

is incredibly important .

Likewise, perhaps the first, and arguably the most important, difference this

topic has from other topics in its field is that we will not be providing a massive

list of rules and syntax to use for coding SQL and designing tables or even

indexes. This is not a reference topic to show exactly which SQL WHERE

clause should be used, or what syntax should be employed, for every conceivable

situation. If we try to follow a long list of complicated, ambiguous, and possibly

even incomplete instructions, we will be following all the others who have already

trod the same path. If on the other hand we understand the impact of what we are

asking the relational system to undertake, and how we can influence that impact,

we will be able to understand, avoid, minimize, and control the problems being

encountered.

A second objective of this topic is to show how we can use this knowledge

to quantify the work being performed. Only in this way can we truly judge

the success of our index design; we need to be able to use actual figures to

show what the optimizer would think, how long the scans would take, and what

modifications would be required to provide satisfactory performance. But most

importantly, we have to be able to do this quickly and easily ; this in turn means

that it is vital to focus on a few major issues, not on the relatively unimportant

detail under which many people drown. This is key—to focus on a very few,

crucially important issues —and to be able to say how long it would take or how

much it would cost .

We have also one further advantage to offer, which again arises as a result of

focusing on what really matters. For those who may be working with more than

one relational product (even from the same vendor), instead of needing to read

Relational Database Index Design and the Optimizers

Search WWH ::

Custom Search

Home