Database Reference
In-Depth Information
4. EXTENSIONAL QUERY EVALUATION
The proposition implies immediately.
Corollary 4.32 Let Q = Q 1 Q 2 ... Q m be a UCQ query where each conjunctive query Q i is
without self-joins. Then any extensional plan for Q that uses the operators independent join, independent
project, independent union, and selection, computes an upper bound for the probabilities of Q.
For example, the plan on the left in Figure 4.3 is guaranteed to return an upper bound of
the correct probabilities of the query in the figure. However, the corollary is much more interesting
when applied to an unsafe query. Consider, for example, Q(z) = R(z,x),S(x,y),T(y) , which, for
a fixed output value z , is equivalent H 0 in Subsection 4.1.3 ; in other words, Q is provably hard for
#P. There are three extensional plans for Q :
z [
z,x (R(z, x)
i
i
P 1 =
x S(x,y))
y T(y)
]
P 2 = z [ R(z, x)
i
x i x (S(x, y)
i
y T(y)) ]
P 3 = z [ R(z, x)
i x S(x,y)
i y T(y) ]
Each of them is guaranteed to return an upper bound on the true probability for Q . Therefore,
if we evaluate all three plans and take the minimum probability for each output tuple, we get a tighter
upper bound. Moreover, it is possible to show that the probabilities returned by P 3 arealwaysan
upper bound of those returned by either P 1 and P 2 (because the lineage that P 3 computes is a
dissociation of the lineage by P 1 , and also a dissociation of the lineage computed by P 2 ). Thus,
P 3 does not provide any additional information, and it suffices to evaluate only P 1 and P 2 .Wecan
generalize this example to any query without self-joins 5 and therefore compute approximate answers
quite efficiently. The disadvantage, however, is that there is no guarantee on how tight the upper
bound is.
4.3
EXTENSIONS
We discuss here several extensions to query evaluation.
4.3.1 BID TABLES
So far, in this chapter, we have considered only tuple-independent tables. In a BID table, the tu-
ples are partitioned into blocks, such that tuples from different blocks are independent, and tuples
within a block are disjoint ( Section 2.7 ). The query evaluation techniques discussed in this chapter
extend with minor modifications to BID tables. Recall that a BID table has a schema of the form
R( A 1 ,...,A k ,B 1 ,...,B m ) , where the attributes A 1 ,...,A k uniquely identify a block; in other
words, they are a key in any possible world. For example, we write T( z ,y) to say that z isakey
attribute; in any possible world, there cannot be two distinct tuples T(b,c 1 ) and T(b,c 2 ) . But the
5 The dissociation inequality no longer holds for self-joins.
Search WWH ::




Custom Search