Database Reference
In-Depth Information
of the rules and the query expression, while tractability is a semantic property, and the question
whether the two notions coincide is not easy. The converse also holds, but only for certain query
languages
.
For a query language
L
L
L
L
. Thus, RC(P )
denotes all tractable relational queries, UCQ(P ) denotes all tractable unions of conjunctive queries,
and CQ(P ) denotes all tractable conjunctive queries. We say that the rules R 6 are complete for
, we denote
(P ) the class of tractable queries in
L
L
if every query in
(P ) is R 6 -safe.
4.1.6.1 The Dichotomy Theorem for UCQ
When the query language under discussion is UCQ, then the rules R 6 are complete: every query on
which they fail is provably hard for #P.
For any Union of Conjunctive Queries Q , one of the following holds:
Theorem 4.23
￿ Q is R 6 -safe, or
￿ The data complexity of Q is hard for # P .
The proof is quite difficult, and can be found in [ Dalvi and Suciu , 2010 ]. The result means
that the queries in UCQ admit a dichotomy: they are either computable in polynomial time, or they
are provably hard for #P. The classification into safe or unsafe queries is based only on the syntax of
the query Q : if the rules R 6 apply, then Q is safe; otherwise, it is unsafe. Here “syntax” is used in a
very loose sense; for example, it includes the Möbius function on the query's CNF lattice.
The theorem tells us exactly which UCQ queries are hard and which are easy although the
criteria used to decide safety is somewhat unsatisfactory. Given Q , apply the rules: if we do not get
stuck, then Q is safe (and tractable); otherwise, it is unsafe. One of the rules, independent-project ,
depends on the active domain of the database: P( x.Q) = 1 a ADom ( 1 P(Q [ a/x ] )) , but in
order to check safety, we can simply consider an active domain of size 1: check if P(Q
) is safe
for every constant a that occurs in Q and check if P(Q [ a/x ] ) is safe for one single constant a that
does not occur in Q . Thus, one can naively apply repeatedly the rules R 6 and check if Q is safe: this
requires two exponential steps at each application of the Möbius function because we need to rewrite
the UCQ query from DNF to CNF, then we need to enumerate all subsets of the CNF expression,
and therefore one can check safety in time super-exponential in the size of the query Q . Currently,
it is unknown if there is a simpler criteria for checking safety; in particular, the query complexity for
safety is unknown.
[
a/x
]
Search WWH ::




Custom Search