Database Reference
In-Depth Information
which the previous query is an example, can be modeled with semijoins (which
can be further transformed in regular joins if unique constraints are satisfied
as in the given example). The problem is more complex when aggregates
are present in the nested subquery, since unnesting requires pulling up the
aggregation without violating the semantics of the nested query, which can
be especially complex in the presence of duplicates and NULL values.
2.1.4 Other Transformations
There are several additional transformations that exploit special cases or com-
mon patterns, some of which we briefly discuss:
Partial preaggregation: In some scenarios, rather than fully pushing a
group-by below a join, we can unfold a group-by clause into a local,
more fine-grained group-by clause, which is pushed down the join, and
a global group-by clause, which performs the final aggregation. Such
staged computation may still be useful in reducing the join costs be-
cause of the data reduction effect of the local aggregate but requires the
aggregate function to satisfy certain properties (e.g., distributivity).
Materialized views: Materialized views are results of views (i.e., queries)
that are stored and used by the optimizer transparently. Given a set of
materialized views and a query, the problem is to optimize the query
while leveraging available materialized views as much as possible. This
problem introduces the challenge of identifying potential reformulations
of the query so that it can use one or more of the materialized views.
Common subexpressions: When intermediate results are defined multiple
times in a query, there are opportunities for avoiding multiple identi-
cal computations. Several specialized techniques leverage this fact to
produce ecient plans that spool and reuse common subexpressions.
Parallelism: When multiple processors are available, each operator in the
query tree can be parallelized by partitioning the input among available
processors and later combining results. Parallelism can reduce the overall
time to process a query but also has overheads in the partitioning and
merging stages.
2.2 Cost Model
As we discussed so far, there are many logically equivalent algebraic expres-
sions associated to a given query and multiple ways to implement each of
these expressions. Even if we ignore the computational complexity of enumer-
ating the space of possibilities, there remains the question of deciding which
Search WWH ::




Custom Search