Query Optimization in Relational
As discussed in the previous chapter, database management systems (DBMSs)
let users specify queries using high-level declarative languages such as SQL .
Then, the DBMS determines the best way to evaluate the input query and
to obtain the desired result. To answer a SQL query, a typical DBMS goes
through a series of steps, illustrated in Figure 2.1:
1. The input query, treated as a string of characters, is parsed and trans-
formed into an internal data structure (i.e., an algebraic tree). This
step performs both syntactic and semantic checks over the input query,
rejecting all invalid requests.
2. The algebraic tree is optimized and turned into an execution plan. A
query execution plan indicates not only the operations required to eval-
uate the input query but also the order in which they are performed and
the specific algorithms used at each step.
3. The execution plan is evaluated, and results are sent back to the user.
The query optimizer is the component in the DBMS that is responsible
for finding an ecient execution plan to evaluate the input query. For that
purpose, the query optimizer searches a large space of alternatives and chooses
the one that is expected to be least expensive to evaluate.
Although implementation details vary among specific DBMSs, virtually all
optimizers share the same high-level conceptual structure and conduct the
search of execution plans in a cost-based manner. More specifically, optimiz-
ers assign each candidate plan its estimated cost and choose the plan with
the least expected cost for execution. The estimated cost of a plan is defined
as the expected amount of resources that the execution engine would require
to evaluate such plan. Query optimization can therefore be viewed as a com-
plex search problem defined by three interacting components, illustrated in
Search space: The search space characterizes the set of execution plans that
would be considered by the optimizer. For instance, some optimizers
consider only certain subsets of join trees or impose restrictions on the
alternative ways to process subqueries.