Database Reference

In-Depth Information

instances. We next discuss an approach to limit the set of views to consider

in bottom-up enumeration strategies.

8.4.1 Restricting the Enumeration Space

An observation that motivates the following approach is that certain subsets

of tables are not particularly helpful to consider. Specifically, even if we were

to propose materialized views on those table subsets, the resulting configu-

ration would be only slightly more ecient. This can happen because the

table subsets occur either infrequently in the workload or only in inexpensive

queries. For instance, consider a workload of 100 queries whose total cost is

10,000 units. Let
T
be a table subset that occurs in 25 queries whose combined

cost is 50 units. Even if we considered all syntactically relevant materialized

views on
T
, the maximum possible benefit of those materialized views for the

workload would be 0

5%. Furthermore, even among table subsets that occur

frequently or in expensive queries, not all table subsets are likely to be equally

useful. Materialized views proposed on subsets of
large
tables are more likely

to be useful than materialized views proposed on smaller ones. Based on these

observations, an approach to restrict the enumeration space can be described

as follows:

1. From the large space of all possible table subsets for the workload, we

arrive at a smaller set of interesting table subsets. Based on the previous

discussion, a metric that captures the relative importance of a table

subset
T
is

.

t
∈
T
size

)

t
mentioned in
q
i
size

(

t

TSW

(

T

)
=

cost

(

q
i
)
·

)

That is, we sum over all queries that reference every table in
T
, the cost

of the query under the current configuration multiplied by the fraction

of the size between tables in
T
and all tables mentioned in the query.

TSW
is a simple function and can discriminate between table subsets

even if they occur in exactly the same queries in the workload. However,

it is not clear how to enumerate all interesting table subsets short of

generating and evaluating every possible subset of tables. In practice,

we can use a relaxed metric
TSC(

(

t

q
i
∈

W
referencing
T

q
i
∈
W
referencing
T

q
i
)
.In

contrast to
TSW
,
TSC
satisfies the monotonicity property, in which

T
1
⊆

T

)
=

cost

(

. This makes it possible to use ecient

algorithms to identify all table subsets that exceed a certain threshold.

Since
TSW

T
2
⇒

TSC

(

T
1

)
≥

TSC

(

T
2

)

, we can postprocess the resulting list of the

top-ranked tuples according to
TSC
and obtain all interesting table

subsets according to the
TSW
ranking function.

2. We recommend candidate views that are defined over interesting table

subsets only, by restricting the techniques discussed in Section 8.2.

3. Starting with the views obtained in the previous step, we generate an

additional set of merged materialized views in a controlled manner. We

(

T

)
≤

TSC

(

T

)

Search WWH ::

Custom Search