Database Reference
In-Depth Information
up all b i values (and B ) in the original problem by the same constant without
changing the problem solution (and thus we assume that b i +
c i holds).
We populate table T i with one tuple with value 0 for col and the remaining
tuples with value 1 for col . Finally, we create a workload W
1
={
Q 1 ,... ,
Q n }
where:
Q i = SELECT c i
FROM T i
WHERE c i =0
and set a storag constraint of B disk pages. Now suppose that we can select
only nonclustered indexes. In such a case, the only candidate indexes that
we consider are I i = T i (
since all queries refer to just one column. All such
indexes are defined over different tables and therefore do not interact with each
other with respect to the queries in W . Consider Q i . If index I i is present, the
optimizer would pick an index-based strategy that seeks tuples where col=0
using a single I/O (since there is a single tuple that satisfies the predicate).
Alternatively, if index I i is not present, the optimizer would scan table T i to
find the tuple satisfying c i = 0 using b i + 1 page I/Os. Therefore, the presence
of index I i results in an improvement of execution time given by b i + 1 1 =
c i )
b i
I/Os. The physical design problem then reduces to finding the best subset of
the original indexes I i , each using c i storage pages and improving the overall
workload cost by a factor of b i . Clearly, the optimal solution for this instance
of the physical design problem directly translates into the optimal solution for
the original 0-1 knapsack problem by selecting the objects o i associated with
tables T i for which an index was chosen.
3.2 Toward Automated Physical Design
In the previous section we showed that even for moderate problem sizes it is
very dicult to manually find the optimal physical database design for a given
workload, and thus automated tools are a necessity. The fact that the physical
design problem is NP-complete tells us that there is little hope for automated
tools finding optimal physical designs except for small problem instances. A
variant of this complexity result was originally proved in the early eighties and
skewed much of the subsequent research in the area toward heuristic search
and approximation algorithms.
Similar to the traditional optimization problem described in Chapter 2,
the physical design problem can be seen as a very complex search problem.
The more general case that additionally considers clustered indexes is slightly more involved,
requiring an additional column col2 and twice as many queries, but we omit such details for
simplicity.
Search WWH ::




Custom Search