Database Reference
In-Depth Information
4.3.3 KEYS IN THE REPRESENTATION
It is common to have a key attribute in the representation of a tuple-independent or a BID table. For
example, consider a tuple-independent relation S(x,y) , and assume that in its representation the
attribute x is a key. This means that the tuples continue to be independent, but that the x attribute
is distinct among all possible tuples.
In general, if x is a key in the representation of a table, then it is also a key in all possible
worlds since a possible world is just some subset of the representation; the table is like a BID table
where every block is uniquely identified by x and contains a single tuple. But the converse does not
hold: if x is key in every possible world, then it means that S(x,y) is a BID-table, and it does not
imply that x is a key in the representation.
If we know which attributes are keys in the representation, then we may be able to find safe
plans for queries that otherwise would be intractable. More generally, assume an arbitrary set of
functional dependencies. For any conjunctive query without self-joins, one can check if it is tractable
using the following process, similar to a chase. We repeat until no more change is possible: if a
functional dependency X y holds, where X is a set of variables, then add the variable y to all
other atoms that contain all variables in X . Once the chase terminates, check if the resulting query
is safe: the original query is tractable iff the chased query is tractable.
For example, assume that x
=
R(x),S(x,y),T(y) : this is the same as H 0 , but we denote it differently to emphasize that it is over
a table S(x,y) where x is a key attribute. Then the chase process described above rewrites Q to
Q = R(x,y),S(x,y),T (y) , which is a safe query. In other words, if we know that x isakeyin
the representation of S , then the query becomes tractable. It is known that the chase procedure is
complete for conjunctive queries without self-joins: it is unknown whether it is complete, or even
how to extend it to a richer class of queries.
y holds in S(x,y) , and consider the Boolean query Q
4.4
BIBLIOGRAPHIC AND HISTORICAL NOTES
Safe queries and safe query plans are introduced by Dalvi and Suciu [ 2004 ]. They also prove a
dichotomy into polynomial time and #P-hard for conjunctive queries without self-joins over tuple-
independent tables. The evaluation algorithm for UCQ queries that we discussed in Section 4.1
was first introduced by Dalvi et al. [ 2010 ]. They introduced the inclusion-exclusion rule and its
extension to Möbius inversion formula, and also announced the dichotomy result in Theorem 4.23 .
The complete proof of this theorem is in [ Dalvi and Suciu , 2010 ].
The safe plans discussed in Section 4.2 have an important practical limitation: they restrict
severely the choices of the query optimizers. Olteanu et al. [ 2009 ] address this problem, by decou-
pling the data processing part of the query plan from the probabilistic inference part. They introduce
a new type of plan that allows the optimizer to choose the best plan for the data processing part, yet
allowing the probabilistic inference to take advantage of the query's safety. In that framework, a safe
plan is an eager plan, where all probabilistic computations are performed as early as possible; lazy
plans are at the other extreme as they compute the probabilities after the result tuples are computed.
Search WWH ::




Custom Search