Automated Physical Database Design - Physical Database Design

Databases Reference

In-Depth Information

this model for “what-if ” analysis. Chaudhuri's initial approach as described in the 1997

paper is summarized in the following list, paraphrased directly from the paper:

1.

Run the query-specific-best-configuration algorithm for identifying candidate

indexes. This requires splitting the given workload of n queries into n workloads

of one query each, and finding the best configuration for each query. The union

of these configurations is the candidate index set for step 2. This step is restricted

to single-column indexes only.

2.

Run the greedy ( m, k ) algorithm, where m is the initial seed set of virtual

indexes, and k is a target recommendation size, to enumerate configurations

subject to the single-join atomic configuration pruning technique and select a

set of indexes until the required number of indexes has been picked or total cost

can be reduced no further. We found that m = 2 produced very good solutions.

The single-join atomic configuration considers at most two indexes per table

and at most two tables per query.

3.

Select a set of admissible multicolumn indexes using the assumption that lead-

ing columns of all multicolumn indexes must have appeared in the single-col-

umn solutions found in steps 1 and 2, based on single-column indexes picked

in step 2.

4.

Repeat steps 1 and 2 starting with the admissible index set consisting of single-

column indexes that were chosen in step 2, and multicolumn indexes selected

in step 3.

In 1999 and 2000 IBM published papers on the same topic [Schiefer 1999, Valentin

2000], suggesting a similar approach but with three very significant changes:s:

1.

The selection and ordering of columns within an index can be biased not only by

the columns used in predicates and joins within the statements of the workload,

but also by considering the order in which the query optimizer evaluated these

predicates, which can give rise to a preferred ordering scheme. In a sense, almost

like running the optimizer rules in reverse, candidates columns can be recom-

mended in a more useful sequence.

2.

Instead of limiting the number of indexes to consider in a very highly con-

strained way, IBM proposed to add a much larger set of virtual indexes to be

evaluated on the first pass through the optimizer. This enables a much broader

search through the possible useful indexes. In general only a few of these

indexes will be selected for any given SQL statement. When more than one

index is selected the benefit of each index to the query improvement is hard to

distinguish without recompiling the entire workload, however, this confusion is

amortized over many statements and is not usually a major inhibitor. The bene-

Physical Database Design

Search WWH ::

Custom Search

Home