Enumerating the Search Space - Automated Physical Database Design and Tuning

Database Reference

In-Depth Information

C to a new candidate C tr . We can eciently estimate the expected decrease

in storage space

S tr = size ( C )

−

size ( C tr ) and the maximum increase in cost

T tr = CostBound ( C tr )

−

Cost ( C ). The value penalty ( C tr )=

T tr /

S tr esti-

mates the increase in execution cost per unit of storage expected for C tr .

Penalty values are opposed to promise values, and thus we can rank configura-

tions by increasing values of penalty ( C tr ). The reason is that we are interested

in configurations that are significantly smaller in space without increasing

the expected cost too much (note that a slight variation of this heuristic is

used in the greedy approximation to the knapsack problem in Section 6.1.2).

Suppose that the space constraint is B (i.e., we are interested in the best con-

figuration that fits in B ). Any decrease in space beyond B is not strictly useful,

but we might artificially decrease the value of penalty ( C tr ) whenever size ( C tr )

is significantly below B . For that reason, we refine the penalty function for

C tr obtained by transforming C as follows:

T tr

penalty

(

C tr ) =

min (size(

) −

S tr )

6.2.1.5

Pruning and Backtracking

As explained in Figure 6.6, we keep transforming the current configuration

until it is pruned, at which point we backtrack to a previous configuration and

start another iteration. Consider a storage constraint B , and assume a SELECT -

only workload. Suppose that the current configuration C exceeds B , but after

transforming C into C we observe that C is within the storage bound B .In

this case, no matter how we further transform C , we would never obtain a

valid configuration that is more e cient than C . The reason is that all the

transformations (e.g., merges and reductions) result in configurations that are

less ecient for the input workload. Therefore, C would be more ecient than

any remaining configuration obtained by transforming C and also fits in the

storage constraint. We can therefore stop the current iteration and prune C .

After we prune a configuration (or whenever the current configuration cannot

be further transformed), we need to pick a suitable configuration to backtrack

to and continue the search space enumeration. One approach to choose a new

configuration is as follows. We first obtain the chain of transformed config-

urations from the initial to the current one. We then pick the configuration

that resulted in the actual largest penalty when transformed (with the aim

of “correcting” what went wrong in the estimation of penalty values). Other

alternative ways to pick a new configuration include choosing the unpruned

configuration with the smallest cost, the smallest storage surplus, or the best

ratio of such quantities.

6.2.2 Handling Updates

So far we focused on SELECT -only workloads. The main impact of an update

query is that some (or all) of the indexes defined over the table updated by

Automated Physical Database Design and Tuning

Search WWH ::

Custom Search

Home