Practical Aspects in Physical Database Design - Automated Physical Database Design and Tuning

Database Reference

In-Depth Information

we need some fixed amount of computation to obtain the first solution,

which is then refined over time. After this first solution is found, we can

interrupt the various enumeration strategies and obtain the best con-

figuration found so far. Details naturally depend on each specific tech-

nique. For instance, consider the knapsack-based technique discussed in

Section 6.1. The idea is to first obtain weights and benefits for all in-

dexes in the enumeration space and then greedily pick indexes to form

the initial solution. After the first solution is found, the algorithm tries

variations by slightly modifying the current configuration. This process

can be stopped at any time, returning the best configuration found so

far. The original greedy(m,B) starts from multiple seeds and performs a

partial exhaustive search on each one followed by a hill-climbing step.

This technique can be redesigned to return the best solution found af-

ter evaluating each seed. It is not clear, however, how to provide more

fine-grained capabilities during the execution of greedy(m,B) for a sin-

gle seed. The top-down approach of Section 6.2 begins with an optimal

configuration that is too large and transforms it until the resulting con-

figuration fits in the available space and cannot be further improved. At

that point, the technique backtracks to a previous solution and starts

a new iteration. Therefore, after the first solution is found, we can stop

the search at any moment and return the best solution found so far.

Incremental approach. In the previous approach we cannot interrupt

the enumeration strategy before the first solution has been calculated.

When the workload is large, this step can take a long time in which no

partial solutions are generated. To overcome this limitation, we can ran-

domly partition the workload into pieces and incrementally tune larger

query subsets over time. If we interrupt the algorithm before the whole

workload has been processed, we would get a configuration that is good

for a subset of the original workload. In that sense, the longer the algo-

rithm runs, the more accurate is the solution.

Pause/resume approach. The previous approaches allow a DBA to con-

tinuously monitor the quality of the best configuration found over time

and to interrupt the tuning session whenever the current solution looks

good enough. In some scenarios, it might be useful to pause the tun-

ing process for some time and to resume it later. For instance, suppose

that a DBA has a fraction of time per day to tune workloads, but the

window of opportunity is not large enough to allow a full tuning session

to finish. It is important to have checkpoint mechanisms that allow us

to pause a tuning session and restart it later from the point it was sus-

pended. A common denominator approach to enable such functionality

is to cache the result of optimizing every (configuration, query) pair.

When we suspend a tuning session, we save this information to disk.

At restart time, we read back the cache and rerun the tuning session

without performing optimization calls up to the time it was suspended.

Automated Physical Database Design and Tuning

Search WWH ::

Custom Search

Home