Database Reference
In-Depth Information
5.1.3 READ-ONCE FORMULAS
An important class of propositional formulas that play a special role in probabilistic databases are
read-once formulas. We restrict our discussion to the case when all random variables
X
are Boolean
variables.
is called
read-once
if there is a formula
equivalent to
such that every variable occurs
at most once in
. For example:
=
X
1
Y
1
∨
X
1
Y
2
∨
X
2
Y
3
∨
X
2
Y
4
∨
X
2
Y
5
is read-once because it is equivalent to the following formula:
=
X
1
(Y
1
∨
Y
2
)
∨
X
2
(Y
3
∨
Y
4
∨
Y
5
)
Read-once formulas admit an elegant characterization, which we describe next.
A formula
is called
unate
if every propositional variable
X
occurs either only positively (
X
),
or only negatively (
¬
X
)in
.Let
be written in DNF and assume each conjunct is a minterm; that
is, no other conjunct is a strict subset (otherwise, absorption rule applies, and we can further simplify
in polynomial time). The
primal graph
of
is
G
P
=
(V , E)
where
V
is the set of propositional
variables in
, and for every pair of variables
X, Y
that occur together in some conjunct, there is an
edge
(X, Y )
in
E
. Thus, every conjunct in
becomes a clique in
G
P
.
Let
P
4
denote the following graph with 4 vertices:
u
−
v
−
w
−
z
; that is,
P
4
consists of a path
of length 4 and no other edges. We say that
is
P
4
-free
if no subgraph induced by 4 vertices in
G
P
is
isomorphic to
P
4
. Finally, we say that
is
normal
if for every clique in
G
P
, there exists a conjunct in
containing all variables in the clique. The following characterization of unate read-once formulas
is due to
Gurvich
[
1991
].
Theorem 5.8
A unate formula
is read-once iff it is
P
4
-free and normal.
For example, the primal graph of
XU
∨
XV
∨
YU
∨
YV
is the complete bipartite graph with
vertices
; thus, it is
P
4
-free and normal: sure enough, the formula can be written as
(X
∨
Y)
∧
(U
∨
V)
, which is read-once. On the other hand, the primal graph of
XY
∨
YZ
∨
ZU
is precisely
P
4
; hence, it is not read-once. The primal graph of
XY
{
X, Y
}
and
{
U, V
}
∨
XZ
∨
YZ
contains the clique
{
X, Y, Z
}
but there is no minterm
XYZ
, hence the formula is not normal and, therefore, not read-
once.
If
is given as a read-once expression, then
P ()
can be computed in linear time, by simply
running
Algorithm 2
, and applying only the independent-and, independent-or, and negation rules.
Moreover, if a unate DNF formula
admits a read-once equivalent
, then
can be computed
from
in polynomial time [
Golumbic et al.
,
2005
]. Thus, read-once formulas can be evaluated very
efficiently, and this justifies our special interest in this class of formulas.