Database Reference
In-Depth Information
instances. We next discuss an approach to limit the set of views to consider
in bottom-up enumeration strategies.
8.4.1 Restricting the Enumeration Space
An observation that motivates the following approach is that certain subsets
of tables are not particularly helpful to consider. Specifically, even if we were
to propose materialized views on those table subsets, the resulting configu-
ration would be only slightly more ecient. This can happen because the
table subsets occur either infrequently in the workload or only in inexpensive
queries. For instance, consider a workload of 100 queries whose total cost is
10,000 units. Let T be a table subset that occurs in 25 queries whose combined
cost is 50 units. Even if we considered all syntactically relevant materialized
views on T , the maximum possible benefit of those materialized views for the
workload would be 0
5%. Furthermore, even among table subsets that occur
frequently or in expensive queries, not all table subsets are likely to be equally
useful. Materialized views proposed on subsets of large tables are more likely
to be useful than materialized views proposed on smaller ones. Based on these
observations, an approach to restrict the enumeration space can be described
as follows:
1. From the large space of all possible table subsets for the workload, we
arrive at a smaller set of interesting table subsets. Based on the previous
discussion, a metric that captures the relative importance of a table
subset T is
.
t T size
)
t mentioned in q i size
(
t
TSW
(
T
) =
cost
(
q i ) ·
)
That is, we sum over all queries that reference every table in T , the cost
of the query under the current configuration multiplied by the fraction
of the size between tables in T and all tables mentioned in the query.
TSW is a simple function and can discriminate between table subsets
even if they occur in exactly the same queries in the workload. However,
it is not clear how to enumerate all interesting table subsets short of
generating and evaluating every possible subset of tables. In practice,
we can use a relaxed metric TSC(
(
t
q i
W referencing T
q i W referencing T
q i ) .In
contrast to TSW , TSC satisfies the monotonicity property, in which
T 1
T
) =
cost
(
. This makes it possible to use ecient
algorithms to identify all table subsets that exceed a certain threshold.
Since TSW
T 2
TSC
(
T 1
)
TSC
(
T 2
)
, we can postprocess the resulting list of the
top-ranked tuples according to TSC and obtain all interesting table
subsets according to the TSW ranking function.
2. We recommend candidate views that are defined over interesting table
subsets only, by restricting the techniques discussed in Section 8.2.
3. Starting with the views obtained in the previous step, we generate an
additional set of merged materialized views in a controlled manner. We
(
T
)
TSC
(
T
)
Search WWH ::




Custom Search