Database Reference
In-Depth Information
In the following example, based on the
group_by_placement.sql
script, there's a parent-child relationship
between the two tables, and the child table (
t2
) contains many more rows than the parent table (
t1
):
SELECT t1.n, t2.n, count(*)
FROM t1, t2
WHERE t1.id = t2.t1_id
GROUP BY t1.n, t2.n
When the number of distinct values of
t2.n
is much lower than the number of rows stored in the child table,
group-by placement produces the following query. An additional
GROUP BY
clause is applied to the data of the child
table to eliminate duplicates before joining the parent table. Also notice that in the top-level
SELECT
clause the
count
function is replaced by the
sum
function:
SELECT t1.n, vw_gb.n, sum(vw_gb.cnt)
FROM t1, (SELECT t2.t1_id, t2.n, count(*) AS cnt
FROM t2
GROUP BY t2.t1_id, t2.n) vw_gb
WHERE t1.id = vw_gb.t1_id
GROUP BY t1.n, vw_gb.n
Order-By Elimination
The purpose of
order-by elimination
is to remove superfluous
ORDER BY
clauses from subqueries, inline views, and
regular views. Top-level
SELECT
clauses are obviously not considered by this heuristic-based query transformation.
ORDER BY
clauses can be considered unnecessary when an
ORDER BY
is followed by an operation that doesn't
guarantee that it will return the rows ordered, or that it will return the rows in a different order; for example, another
ORDER BY
or an aggregation.
In the following example, based on the
order_by_elimination.sql
script, there's not only an
ORDER BY
in the
inline view, but also a
GROUP BY
in the top-level query block:
SELECT n2, count(*)
FROM (SELECT n1, n2
FROM t
ORDER BY n1)
GROUP BY n2
Because the
GROUP BY
in the top-level query block doesn't guarantee the order of the returned rows, order-by
elimination removes the
ORDER BY
and produces the following query:
SELECT n2, count(*)
FROM (SELECT n1, n2
FROM t)
GROUP BY n2
The query optimizer, with select list pruning and simple view merging, further transforms the query into the following:
SELECT n2, count(*)
FROM t
GROUP BY n2