Database Reference
In-Depth Information
Cost model: The cost model assigns a cost value to every point in the search
space through estimation of the resources consumed by the plan. The
quality of the plan returned by the optimizer is only as good as its cost
model.
Enumeration strategy: The search space can be characterized without
specifying a strategy to generate all candidate plans. The enumeration
strategy serves this purpose, and its goal is to traverse all interesting
plans in the search space eciently (note that regions in the search
space that are guaranteed to be suboptimal for an input query can be
skipped by the enumeration strategy).
The task of an optimizer is nontrivial given the very large number of execu-
tion plans for an input query, the large variance in response time of the plans
in the search space, and the diculty of accurately estimating costs of plans.
A desirable optimizer is one for which (1) the search space includes plans with
low cost, (2) the cost model is accurate, and (3) the enumeration strategy is
ecient. In this chapter we describe each of these aspects in detail.
2.1 Search Space
The search space of an optimizer depends on the set of physical operators sup-
ported by the database engine and the set of algebraic equivalences among exe-
cution subplans. Often, the main source of diversity within an optimizer arises
from the supported set of equivalence-preserving algebraic transformations.
2.1.1 Operator Reordering
An important class of transformations exploits the commutativity property
among operators. We next describe three examples of such transformations.
2.1.1.1
Join Reordering
Optimizers can restrict the allowed sequences of join operations to limit the
search space and therefore improve optimization time. For example, certain
optimizers allow only linear sequences of join operations and defer Cartesian
products until after all joins are processed. However, since join operations are
commutative and associative, the sequence of joins in an operator tree does
not need to be linear. Specifically, a query joining relations R 1 , R 2 , R 3 , and
R 4 can be algebraically represented and evaluated as
(
R 1
R 2 ) (
R 3
R 4 )
,
where the symbol
represents the join operator. Such query trees are called
bushy and typically require materializing intermediate relations. While bushy
Search WWH ::




Custom Search