Database Reference
In-Depth Information
5.1.3 READ-ONCE FORMULAS
An important class of propositional formulas that play a special role in probabilistic databases are
read-once formulas. We restrict our discussion to the case when all random variables X are Boolean
variables.
is called read-once if there is a formula
equivalent to such that every variable occurs
at most once in . For example:
=
X 1 Y 1
X 1 Y 2
X 2 Y 3
X 2 Y 4
X 2 Y 5
is read-once because it is equivalent to the following formula:
= X 1 (Y 1 Y 2 ) X 2 (Y 3 Y 4 Y 5 )
Read-once formulas admit an elegant characterization, which we describe next.
A formula is called unate if every propositional variable X occurs either only positively ( X ),
or only negatively (
¬ X )in .Let be written in DNF and assume each conjunct is a minterm; that
is, no other conjunct is a strict subset (otherwise, absorption rule applies, and we can further simplify
in polynomial time). The primal graph of is G P = (V , E) where V is the set of propositional
variables in , and for every pair of variables X, Y that occur together in some conjunct, there is an
edge (X, Y ) in E . Thus, every conjunct in becomes a clique in G P .
Let P 4 denote the following graph with 4 vertices: u v w z ; that is, P 4 consists of a path
of length 4 and no other edges. We say that is P 4 -free if no subgraph induced by 4 vertices in G P is
isomorphic to P 4 . Finally, we say that is normal if for every clique in G P , there exists a conjunct in
containing all variables in the clique. The following characterization of unate read-once formulas
is due to Gurvich [ 1991 ].
Theorem 5.8
A unate formula is read-once iff it is P 4 -free and normal.
For example, the primal graph of XU XV YU YV is the complete bipartite graph with
vertices
; thus, it is P 4 -free and normal: sure enough, the formula can be written as
(X Y) (U V) , which is read-once. On the other hand, the primal graph of XY YZ ZU
is precisely P 4 ; hence, it is not read-once. The primal graph of XY
{ X, Y }
and
{ U, V }
XZ
YZ contains the clique
{ X, Y, Z }
but there is no minterm XYZ , hence the formula is not normal and, therefore, not read-
once.
If is given as a read-once expression, then P () can be computed in linear time, by simply
running Algorithm 2 , and applying only the independent-and, independent-or, and negation rules.
Moreover, if a unate DNF formula admits a read-once equivalent , then can be computed
from in polynomial time [ Golumbic et al. , 2005 ]. Thus, read-once formulas can be evaluated very
efficiently, and this justifies our special interest in this class of formulas.
Search WWH ::




Custom Search