Database Reference
In-Depth Information
k
-partitioned graph, i.e., a graph where the vertices are partitioned into
k
disjoint sets, and every edge
goes from some node in partition
i
to some node in partition
i
+
1, for some
i
=
1
,k
−
1. The ques-
tion is, how large should we choose
k
. Clearly, if we choose
k
2, i.e., we consider bipartite graphs,
then
Q
is always false, hence it is easy to compute
P (Q)
(it is 0). If we consider 3-partite graphs, then
there are two kinds of edges: from partition 1 to 2, denoted
U
1
(x, y)
and from partition 2 to 3, denoted
U
2
(x, y)
. These two sets are disjoint, hence
Q
is equivalent to
=
∃
x.
∃
y.
∃
z.U
1
(x, y), U
2
(y, z)
, and
this query has polynomial time data complexity (it can be computed using the rules described in the
next chapter, as
P (Q)
=
−
a
(
1
−
P(
∃
x.U
1
(x, a))
·
P(
∃
z.U
2
(a, z)))
, and
P(
∃
x.U
1
(x, a))
=
1
−
(
1
−
b
(
1
−
P(U
1
(b, a))))
, and similarly for
P(
∃
z.U
2
(a, z))
.). So
Q
is also easy on 3-partite
graphs.
Consider therefore 4-partite graphs. Now, there are three kinds of edges, denoted
U
1
,U
2
,U
3
,
and the query is equivalent to the following (we omit existential quantifiers):
1
U
1
(x, y), U
2
(y, z)
U
2
(y, z), U
3
(z, v)
Q
=
∨
In other words, a path of length 2 can either consist of two edges in
U
1
and
U
2
, or of two edges in
U
2
and
U
3
. Next, we make a further restriction on the 4-partite graph: we restrict the first partition
to have a single node
s
(call it “source node”), and the fourth partition to have a single node
t
(call
it “target node”). We prove that
Q
is hard even if the input is restricted to 4-partite graphs of this
kind. Indeed, then
Q
becomes:
Q
=
U
1
(s, y), U
2
(y, z)
∨
U
2
(y, z), U
3
(z, t)
This is precisely
H
1
=
R(y),S(y,z)
∨
S(y,z),T (z)
, up to the renaming of relations:
R(y)
≡
U
1
(s, y)
;
S(y,z)
U
2
(y, z)
; and
T(z)
U
3
(z, t)
.
≡
≡
3.3
BIBLIOGRAPHIC AND HISTORICAL NOTES
Valiant
[
1979
] introduced the complexity class #P and showed, among other things, that model
counting for propositional formulas is hard for #P, even when restricted to positive 2CNF (P2CNF).
Provan and Ball
[
1983
] showed that model counting for Partitioned Positive 2CNF (PP2CNF) is
also hard for #P.
There exists different kinds of reductions for proving #P-hardness, which result in slightly
different types of #P-hardness results: our proof of
Theorem 3.2
uses a 1-Turing reduction for the
hardness proof of
H
0
and a Turing reduction for the hardness proof of
H
1
.
Durand et al.
[
2005
]
discusses various notions of reductions.
The data complexity of queries on probabilistic databases was first considered by
Grädel et al.
[
1998
], which showed, in essence, that the Boolean query
R(x),S(x,y),R(y)
is hard for #P, by
reduction from P2DNF.
Dalvi and Suciu
[
2004
] consider conjunctive queries without self-joins,
and prove a dichotomy into polynomial-time and #P-hard queries. In particular, they prove that
H
0
=
R(x),S(x,y),T(y)
is #P-hard by reduction from PP2DNF (same proof as in this chapter).