we need some fixed amount of computation to obtain the first solution,
which is then refined over time. After this first solution is found, we can
interrupt the various enumeration strategies and obtain the best con-
figuration found so far. Details naturally depend on each specific tech-
nique. For instance, consider the knapsack-based technique discussed in
Section 6.1. The idea is to first obtain weights and benefits for all in-
dexes in the enumeration space and then greedily pick indexes to form
the initial solution. After the first solution is found, the algorithm tries
variations by slightly modifying the current configuration. This process
can be stopped at any time, returning the best configuration found so
far. The original greedy(m,B) starts from multiple seeds and performs a
partial exhaustive search on each one followed by a hill-climbing step.
This technique can be redesigned to return the best solution found af-
ter evaluating each seed. It is not clear, however, how to provide more
fine-grained capabilities during the execution of greedy(m,B) for a sin-
gle seed. The top-down approach of Section 6.2 begins with an optimal
configuration that is too large and transforms it until the resulting con-
figuration fits in the available space and cannot be further improved. At
that point, the technique backtracks to a previous solution and starts
a new iteration. Therefore, after the first solution is found, we can stop
the search at any moment and return the best solution found so far.
Incremental approach. In the previous approach we cannot interrupt
the enumeration strategy before the first solution has been calculated.
When the workload is large, this step can take a long time in which no
partial solutions are generated. To overcome this limitation, we can ran-
domly partition the workload into pieces and incrementally tune larger
query subsets over time. If we interrupt the algorithm before the whole
workload has been processed, we would get a configuration that is good
for a subset of the original workload. In that sense, the longer the algo-
rithm runs, the more accurate is the solution.
Pause/resume approach. The previous approaches allow a DBA to con-
tinuously monitor the quality of the best configuration found over time
and to interrupt the tuning session whenever the current solution looks
good enough. In some scenarios, it might be useful to pause the tun-
ing process for some time and to resume it later. For instance, suppose
that a DBA has a fraction of time per day to tune workloads, but the
window of opportunity is not large enough to allow a full tuning session
to finish. It is important to have checkpoint mechanisms that allow us
to pause a tuning session and restart it later from the point it was sus-
pended. A common denominator approach to enable such functionality
is to cache the result of optimizing every (configuration, query) pair.
When we suspend a tuning session, we save this information to disk.
At restart time, we read back the cache and rerun the tuning session
without performing optimization calls up to the time it was suspended.