Database Reference
In-Depth Information
constraint to prevent more than a single clustered index per table, but we
omit it for simplicity.
This integer program can be adapted to handle updates by slightly modify-
ing the optimizing function so that it takes into account the negative benefit
of indexes that are updated (i.e., using the traditional separation of an UPDATE
query into an inner SELECT and an UPDATE shell). Once the integer program
is defined, it can be solved exactly or approximately by one of many existing
general purpose solvers.
In order to define an integer problem for a given physical design problem
instance, we need to calculate size(
I j ) and b ik values. While sizes are easy to
approximate, obtaining b ik benefit values is more challenging. Specifically, we
need to obtain the cost of each query under every atomic configuration defined
from the enumeration space. If the enumeration space is moderately large or
contains merged, split, or reduced indexes, there can be a combinatorial explo-
sion in the number of such benefit values (note that integer programming itself
is NP-hard). For small problem instances, however, the integer programming
approach can quickly return the optimal configuration.
6.2 Top-Down Enumeration
The techniques for enumerating the physical design search space discussed in
the previous section rely on the following generic pattern:
1. Identify a set of candidate indexes that speed up each query in isolation.
2. Extend this set by means of index transformations into an enumeration
space.
3. Search the enumeration space for a subset of indexes that satisfies the
space constraint and results in the largest improvement in execution
cost for the workload.
In all previous approaches, steps 2 and 3 are performed separately, and step
3 generally follows a bottom-up strategy that starts with an empty configu-
ration and progressively adds indexes until the space constraint is no longer
satisfied. In Section 4.3 we showed how to (conceptually) characterize the
full enumeration space as the closure of an initial set of candidates under
certain index transformations. This approach suggests a completely different
approach to enumerate the space of configurations. Specifically, in this section
we describe an alternative enumeration scheme that begins with an “optimal”
configuration that is generally too large to fit in the available space and pro-
gressively transforms it into configurations that consume less space and at
the same time are less ecient (we first assume that there are no updates
in the workload and lift this restriction in Section 6.2.2). Conceptually, this
Search WWH ::




Custom Search