Database Reference
In-Depth Information
(a)
(b)
Join
Group 7:
1. Join(2,4)
...
7. HJ(2.3, 4.3)
8. Sort 7.7
Join
Get T
Group 6:
1. Join(1,4)
1. Join(4,1)
...
3. LJ(1.4, 4.3)
Get R
Get S
Group 5:
1. Join(3,4)
2. Join(1,7)
...
6. HJ(4.3, 2.3)
7. MJ(1.5, 7.8)
Group 4:
1. Get T
...
3. Table Scan
5.7. Merge Join
(c)
Group 3:
1. Join(1,2)
1.5. Sorted Index Scan(R)
7.8. Sort
...
3. Table Scan
4. Sort 2.2
Group 2:
1. Get S
7.7 Hash Join
Group 1:
Initial Memo
1. Get R
...
4. Table Scan
5. Sorted Index Scan
Final Memo
2.3. Table Scan(S)
4.3. Table Scan(T)
FIGURE 2.7
The
MEMO
structure in a Cascades optimizer. (Used with per-
mission from Bruno, N. & Nehme, R. In
Proceedings of the ACM International
Conference on Management of Data [SIGMOD]
, 2008.)
to an execution plan for the previous query. As we described earlier, the
goal of the optimization process is to transform the original logical operator
tree into an ecient physical operator tree. For that purpose, Cascades-based
optimizers rely on two components: the
MEMO
data structure (which keeps
track of the explored search space) and
optimization tasks
(which guide the
search strategy). The following sections discuss these notions in more detail.
2.3.3.1
The Memo Data Structure
The
MEMO
data structure in Cascades provides a compact representation of
the search space of plans. In addition to enabling memoization (a variant of
dynamic programming), a
MEMO
provides detection of duplicate operator trees,
cost management, and other supporting infrastructure needed during query
optimization.
A
MEMO
consists of two key data structures, which we call
groups
and
groupExpressions
.A
group
represents all equivalent operator trees producing
the same output. To reduce memory requirements, a
group
does not explic-
itly enumerate all its operator trees. Instead, it implicitly represents all the
operator trees by using
groupExpressions
.A
groupExpression
is an operator
having other
groups
(rather than other operators) as children. As an example,
consider Figure 2.7b, which shows a
MEMO
for the simple query in the pre-
vious section (logical operators are shaded and physical operators use white
background). In the figure,
group
1 represents all equivalent expressions that
return the contents of table
R
. Some operators in
group
1 are logical (e.g.,
Get R
), and some are physical (e.g.,
Table Scan
and
Sorted Index Scan
, which
read the contents of
R
from the primary index or from an existing secondary
index, respectively). Likewise,
group
3 contains all the equivalent expressions
for
R
S
. Note that
groupExpression
3.1 (i.e.,
Join(1,2)
) represents all oper-
ator trees whose root is
Join
, first child belongs to
group
1, and second child
Search WWH ::
Custom Search