Database Reference
In-Depth Information
(a)
(b)
Join
Group 7:
1. Join(2,4)
...
7. HJ(2.3, 4.3)
8. Sort 7.7
Join
Get T
Group 6:
1. Join(1,4)
1. Join(4,1)
...
3. LJ(1.4, 4.3)
Get R
Get S
Group 5:
1. Join(3,4)
2. Join(1,7)
...
6. HJ(4.3, 2.3)
7. MJ(1.5, 7.8)
Group 4:
1. Get T
...
3. Table Scan
5.7. Merge Join
(c)
Group 3:
1. Join(1,2)
1.5. Sorted Index Scan(R)
7.8. Sort
...
3. Table Scan
4. Sort 2.2
Group 2:
1. Get S
7.7 Hash Join
Group 1:
Initial Memo
1. Get R
...
4. Table Scan
5. Sorted Index Scan
Final Memo
2.3. Table Scan(S)
4.3. Table Scan(T)
FIGURE 2.7 The MEMO structure in a Cascades optimizer. (Used with per-
mission from Bruno, N. & Nehme, R. In Proceedings of the ACM International
Conference on Management of Data [SIGMOD] , 2008.)
to an execution plan for the previous query. As we described earlier, the
goal of the optimization process is to transform the original logical operator
tree into an ecient physical operator tree. For that purpose, Cascades-based
optimizers rely on two components: the MEMO data structure (which keeps
track of the explored search space) and optimization tasks (which guide the
search strategy). The following sections discuss these notions in more detail.
2.3.3.1
The Memo Data Structure
The MEMO data structure in Cascades provides a compact representation of
the search space of plans. In addition to enabling memoization (a variant of
dynamic programming), a MEMO provides detection of duplicate operator trees,
cost management, and other supporting infrastructure needed during query
optimization.
A MEMO consists of two key data structures, which we call groups and
groupExpressions .A group represents all equivalent operator trees producing
the same output. To reduce memory requirements, a group does not explic-
itly enumerate all its operator trees. Instead, it implicitly represents all the
operator trees by using groupExpressions .A groupExpression is an operator
having other groups (rather than other operators) as children. As an example,
consider Figure 2.7b, which shows a MEMO for the simple query in the pre-
vious section (logical operators are shaded and physical operators use white
background). In the figure, group 1 represents all equivalent expressions that
return the contents of table R . Some operators in group 1 are logical (e.g.,
Get R ), and some are physical (e.g., Table Scan and Sorted Index Scan , which
read the contents of R from the primary index or from an existing secondary
index, respectively). Likewise, group 3 contains all the equivalent expressions
for R
S . Note that groupExpression 3.1 (i.e., Join(1,2) ) represents all oper-
ator trees whose root is Join , first child belongs to group 1, and second child
Search WWH ::




Custom Search