Database Reference
In-Depth Information
Chapter 6
Enumerating the Search Space
In Chapter 4 we characterized the search space for the physical design prob-
lem, and in Chapter 5 we discussed how to approximate the cost of a workload
and required storage for arbitrary configurations. As we showed in Chapter 3,
however, the number of configurations in the search space is very large, which
rules out any exact solution. In this chapter we introduce different heuristic
approaches to enumerate the search space and thus solve the physical design
problem.
Enumeration strategies can be categorized as bottom-up or top-down, with
each one associated with different challenges and advantages. Bottom-up
strategies begin with an empty (or preexisting) configuration and progres-
sively add indexes. This approach can be ecient when available storage is
low, since the best configuration is likely to consist of only a few indexes. In
contrast, a top-down approach begins with a globally optimal configuration,
which generally exceeds the storage bound, and thus it is not valid. It then
progressively refines the configuration until it meets the storage constraints.
Top-down strategies have several key desirable properties, especially when the
storage constraint is not very tight. It remains an interesting open problem
whether hybrid schemes based on specific input characteristics can improve
upon the aforementioned approaches.
6.1 Bottom-Up Enumeration
The bottom-up techniques discussed in this section begin by identifying a
subset of the full set of indexes that can be part of the final configuration,
called the enumeration space . Since the full set of indexes for a given workload
is generally too large to manipulate, the enumeration space implements the
first heuristic to reduce the complexity of the optimization problem. The enu-
meration strategy proper is therefore conducted over the enumeration space.
Figure 6.1 illustrates a high-level overview of a generic bottom-up approach.
The simplest way to define the enumeration space for a given workload is
as the union of candidate indexes for each query in the workload, as described
99
Search WWH ::




Custom Search