Enumerating the Search Space - Automated Physical Database Design and Tuning

Database Reference

In-Depth Information

configuration by replacing a subset of its indexes by another one so that the

resulting configuration is smaller at the cost of being generally less ecient.

We then rank the resulting candidate configurations based on their expected

promise and pick the best candidate that is not already in the global cache,

which becomes the current configuration. This cycle repeats until the stopping

condition is met, and we output the best configuration found so far (step 4

in the figure). Note that each time a configuration is evaluated, we check

whether we should prune the current configuration. If so, we retrieve from

the global cache a previously explored configuration that is not pruned and

continue (this step effectively implements a backtracking mechanism). In the

rest of this section we give additional details on each step in Figure 6.6.

6.2.1.1

Choosing the Initial Configuration

Although any configuration can be chosen to be the starting point in the

search, the initial configuration effectively restricts the search space. Specifi-

cally, the enumeration architecture eventually considers any configuration that

is a subset of the closure of the initial configuration under the set of trans-

formations. Therefore, if we consider as the initial configuration the union of

all candidate indexes for each query in the workload, the resulting enumer-

ation space is the same as the full search space as defined in Section 4.3. If

the workload consists of SELECT queries only, this configuration is the optimal

one in absence of storage constraints or results in a lower bound for estimated

cost if there is a storage constraint.

6.2.1.2 Evaluating the Current Configuration

Each search iteration requires evaluating a previously unexplored configu-

ration, which consists of two tasks. First, we need to determine whether the

storage constraint is satisfied and, if not, how close is the current configuration

to a viable state. With a storage constraint of B , we simply estimate the size

of the current configuration, size (

C

) .If size (

C

) ≤

B , the storage constraint

is satisfied. Otherwise, the value size (

B quantifies how close we are to

a valid configuration. Second, we need to evaluate the optimizing function,

that is, the expected cost of the workload under the current configuration.

In order to do so, we need to optimize the queries in the workload in what-if

mode, which returns the expected cost of each query without materializing the

configuration. This step is usually the bottleneck of the whole process, since

optimizer calls are typically expensive. Chapter 5 explores different ways to

reduce this overhead.

C

) −

6.2.1.3

Generating Candidate Configurations

After evaluating the current configuration, we apply transformations to gen-

erate a set of new, unexplored configurations in the search space. For that

purpose, we use the family of transformations introduced in Section 4.2.2.

Automated Physical Database Design and Tuning

Search WWH ::

Custom Search

Home