Database Reference
In-Depth Information
When additionally considering index-based joins, the analysis of INUM be-
comes more complex. The problem is that some properties in the cost model
of hash- and merge-based join operators are not applicable when considering
index-based joins. For that reason, INUM uses a simplified cost model and
additional optimization calls to be able to handle index-based joins. Although
the number of optimization calls in the preprocessing step might be high,
INUM is able to perform very quick reoptimizations afterward and results in
performance improvements during tuning sessions that explore large numbers
of configurations.
5.4 Configuration-Parametric Query
Optimization (CPQO)
The INUM technique described earlier can be used to infer costs of queries
under arbitrary configurations at a fraction of the cost of a what-if optimiza-
tion call. However, INUM is not fully integrated with the query optimizer.
For that reason, it is not clear how to extend the approach for more complex
execution plans (such as correlated subqueries). INUM also relies on some
assumptions to reduce the number of regular optimization calls per query in
the precomputation phase (which could be exponential in the number of ta-
bles in the worst case). As an example, for TPC-H query 15, INUM requires
over a thousand regular optimization calls for simpler variations of the input
query before it can start optimizing arbitrary configurations. In this section
we present a technique called configuration-parametric query optimization ,or
CPQO for short. The idea is to issue a single optimization call per query
(possibly with a larger overhead than that of a regular optimization call) and
obtain back a compact representation of all possible internal subplans that
allows us to eciently generate execution plans for arbitrary configurations.
Then, the modest overhead during the first (and only) optimization call is
amortized when the same query is reoptimized for different configurations.
We next introduce CPQO in the context of a transformation-based optimizer
such as Cascades (see Chapter 2 for details).
Figure 5.8 shows a high-level, conceptual illustration of CPQO . Consider
the final MEMO structure after optimizing an input query q . If we consider only
physical operators, this final MEMO is a directed acyclic graph, where each node
is a physical operator and edges going out of a node G connect G with its best
children. The rule set in a transformation-based optimizer contains a small
subset that deals with access path selection (see Section 2.3.3.3). Intuitively,
these are the few rules that might make a difference when optimizing the
same query under different configurations. Now suppose that we identify the
subgraphs in the MEMO that were produced by a rule that deals with access
path selection (e.g., P 1 ,... ,
P 4 in Figure 5.8a). Then, everything that is above
Search WWH ::




Custom Search