Database Reference
In-Depth Information
Group-by
[ b 2 , key( A )]
a = b 1
Group-by
[ b 2 ]
A
a = b 1
A
B
B
FIGURE 2.3
Reordering JOIN and GROUP BY clauses.
trees may result in a cheaper query plan, they considerably increase the cost
of enumerating the search space. In general, most systems focus on linear join
sequences and some restricted subsets of bushy join trees. Deferring Cartesian
products may also miss opportunities in some decision support queries where
a single big table joins with multiple small tables. Performing some Cartesian
products among these small tables before joining them with the big table can
result in a significant reduction in cost.
2.1.1.2
Group-By and Join Clauses
In the conceptual evaluation of a SQL query block, the processing of the join
precedes that of the group-by (see Section 1.2). Similarly, a group-by clause in
a subquery in the FROM clause is conceptually evaluated before the joins in the
outer query block. Some algebraic transformations enable commuting group-
by and join operators. The evaluation of a group-by operator can potentially
result in a significant reduction in the number of tuples, since only one tu-
ple is generated for every partition of the relation induced by the group-by.
At the same time, joins might eliminate large portions of an input relation,
resulting in fewer tuples to subsequently group. Therefore, in some cases we
can significantly reduce the cost of a query by reordering join and group-by
clauses.
As an example of such transformations, a group-by clause can be pulled up
above a join, as long as (1) we add a key of the other join relational input
to the set of grouping columns, and (2) the join predicate is not defined over
an aggregated column (or the resulting join is not well formed). Figure 2.3
illustrates this transformation (note that b 2 must include b 1 so that the origi-
nal join is well formed). Conversely, a group-by clause can be pushed below a
join R
S (down to S ) whenever (1) the grouping columns include a key of
R , (2) the columns from S in the join predicate are included in the grouping
columns or derived from them via functional dependencies, and (3) aggregates
are defined in terms of columns in S .
2.1.1.3
Outerjoins
One-sided outerjoins are asymmetric operators in SQL that preserve all of the
tuples of one relation. For instance, a left outer join between tables R and
Search WWH ::




Custom Search