Databases Reference
In-Depth Information
This leads us to another major technical challenge for the Query Optimizer: accurate
cost and cardinality estimation. Since a cost-based optimizer selects the execution plan
with the lowest cost, the quality of the plan selection is only as good as the accuracy of
the optimizer's cost and cardinality estimations. Even supposing that time is not a
concern, and that the query optimizer can analyze the entire search space without a
problem, cardinality and cost estimation errors can still make a query optimizer select
the wrong plan. Cost estimation models are inherently inexact, as they do not consider
all of the hardware conditions, and must necessarily make certain assumptions about
the environment. For example, the costing model assumes that every query starts with a
cold cache (i.e. that its data is read from disk and not from memory) and this assumption
could lead to costing estimation errors in some cases. In addition, cost estimation
relies on cardinality estimation, which is also inexact and has some known limitations,
especially when it comes to the estimation of the intermediate results in a plan. On top of
all that, there are some operations which are not covered by the mathematical model of
the cardinality estimation component, which has to resort to guess logic or heuristics to
deal with these situations. Cardinality and cost estimation will be covered in more detail
in Chapter 3 , Statistics and Cost Estimation .
A historical perspective
We've seen some of the challenges query optimizers still face today, but these
imperfections are not for want of time or research. One of these earliest works
describing a cost-based query optimizer was Access Path Selection in a Relational
Database Management System , published in 1979 by Pat Selinger et al., to describe the
query optimizer for an experimental database management system developed in 1975
at what is now the IBM Almaden Research Center. This database management system,
called "System R," advanced the field of query optimization by introducing the use of cost-
based query optimization, the use of statistics, an efficient method of determining join
orders, and the addition of CPU cost to the optimizer's cost estimation formulae.
Search WWH ::




Custom Search