Database Reference
In-Depth Information
be more than one). Third, the resulting expressions obtained by apply-
ing the rules are integrated back into the MEMO , possibly creating new,
unexplored groups (as an example, applying the join associativity rule
to expression 5.1 in Figure 2.7b results in groupExpression 5.2, which
points to a newly created group 7). Finally, each new groupExpression
triggers follow-up tasks, which depend on its type. If it is a logical op-
erator, the optimizer was exploring the group and thus an ExploreExpr
task is scheduled for the new groupExpression . Otherwise, the expres-
sion inputs are optimized, and the cost of the physical plan is calculated
(see OptInputs task).
OptInputs: This task optimizes the inputs of a given physical operator p
and computes the best execution plan rooted at p . For each child p i of p ,
it first calculates the required properties of p i with respect to p and then
schedules an OptimizeGroup task for the group of p i . As an example,
suppose that the root of the tree is a Merge Join operator. Since Merge
Join expects the inputs ordered by the join column, the current task
generates a required sort property for each of the inputs and optimizes
the corresponding groups under this new optimization context. The Opt-
Inputs task also implements a cost-based pruning strategy. Whenever it
detects that the lower bound of the expression that is being optimized is
larger than the cost of an existing solution, it fails and returns no plan
for that optimization goal.
As we can see, the five optimization tasks are nontrivial and depend on
each other. For clarity of exposition, we now present a conceptually simplified
version of the OptimizeGroup task that incorporates portions of the remain-
ing tasks that are relevant in our context. Figure 2.9 shows the algorithm for
OptimizeGroup , which takes as inputs a group G , required properties RP , and
a cost upper-bound UB . OptimizeGroup ( G,RP,UB ) returns the most ecient
physical groupExpression that satisfies RP and is under UB in cost (otherwise,
it returns NULL ). Initially, line 1 checks the winner circle for a previous call
compatible with RP . If we find such plan, we return it (if its cost is below UB ),
or we return NULL otherwise. Otherwise, line 4 iterates over all enumerated
physical groupExpressions in G (note that line 4 encapsulates ExploreGroup ,
ExploreExpr , and ApplyRule ). For each such root groupExpression p , line 5 es-
timates the local cost of candP (i.e., without counting its inputs' costs). Then,
line 8 calculates the input group and required properties for each of candP 's
inputs, and line 9 recursively calls OptimizeGroup to optimize them (note
that the upper bound in the recursive call is decreased to UB - candP.cost ).
After each input is successfully optimized, lines 11 and 12 store the best im-
plementation for each of candP 's children and update its partial cost. If at
any moment the current cost of candP becomes larger than UB , the candidate
is discarded and the next one is considered (line 7). Otherwise, after candP is
completely optimized, line 13 checks whether candP is the best plan found so
Search WWH ::




Custom Search