Enumerating the Search Space - Automated Physical Database Design and Tuning

Database Reference

In-Depth Information

Specifically, given the current configuration C , we generate the following con-

figurations:

•

Merging: C

−{

I 1 ,

I 2 }∪{

I 1 ⊕

I 2 }

for each pair of compatible indexes ( I 1 ,

I 2 )in C .

•

Splitting: C

−{

I 1 ,

I 2 }∪{

I 1 ⊗

I 2 }

for each pair of compatible indexes ( I 1 ,

I 2 )in C .

•

−{

}∪{ ρ(

) }

Reduction: C

for each index I in C and prefix K of I .

•

Promotion: C

for each index I in C , where I cl is

the clustered index of I 's table (if defined).

−{

}−{

I cl }∪{ (

) }

•

Deletion: C

} for each index I in C . If the removed index is a

clustered index, it is replaced by the corresponding table heap.

−{

)) ,

where n is the number of indexes in C , and m is the maximum number of

columns in an index in C . In real situations this number is likely to be much

smaller, because indexes are spread throughout several tables (and therefore

merging and splitting are valid for only a subset of cases) and also because

not all reductions need to be considered. To clarify the latter point, consider

index I

The number of transformations for a given configuration C is O(

· (

(

) and the single-query workload:

SELECT a, b, c, d, e

FROM R

WHEREa=10

In this situation, the only useful reduction for the index is I

(

)

, since

any other prefix of I is going to be both larger than I and less ecient for

processing the query.

6.2.1.4

Ranking Candidate Configurations

After generating all candidate configurations from the current one via trans-

formations, we need to rank them in decreasing order of “promise,” so that

better configurations are chosen and explored first. For that purpose, we es-

timate both the expected cost of the workload and the expected size (i.e.,

the storage constraint) of each resulting configuration. While estimating sizes

can be done eciently, estimating costs is more challenging. The reason is

that often there are several candidate configurations to rank, and the cost

of performing what-if optimization calls (even when using the optimizations

described in Sections 5.3 and 5.4) is too costly. To reduce overheads, we can

use the upper-bound approach of Section 5.2.2 and obtain upper bounds on

the cost of queries for each candidate transformation.

Once we obtain estimates for both the optimizing function and the devia-

tion from the storage constraint for each of the alternative configurations, we

need to put together these values to rank the different candidate transforma-

tions. Consider some transformation that goes from the current configuration

Automated Physical Database Design and Tuning

Search WWH ::

Custom Search

Home