Intensional Query Evaluation - Probabilistic Databases

Database Reference

In-Depth Information

5.1.3 READ-ONCE FORMULAS

An important class of propositional formulas that play a special role in probabilistic databases are

read-once formulas. We restrict our discussion to the case when all random variables X are Boolean

variables.

is called read-once if there is a formula

equivalent to such that every variable occurs

at most once in . For example:

X 1 Y 1 ∨

X 1 Y 2 ∨

X 2 Y 3 ∨

X 2 Y 4 ∨

X 2 Y 5

is read-once because it is equivalent to the following formula:

= X 1 (Y 1 ∨ Y 2 ) ∨ X 2 (Y 3 ∨ Y 4 ∨ Y 5 )

Read-once formulas admit an elegant characterization, which we describe next.

A formula is called unate if every propositional variable X occurs either only positively ( X ),

or only negatively (

¬ X )in .Let be written in DNF and assume each conjunct is a minterm; that

is, no other conjunct is a strict subset (otherwise, absorption rule applies, and we can further simplify

in polynomial time). The primal graph of is G P = (V , E) where V is the set of propositional

variables in , and for every pair of variables X, Y that occur together in some conjunct, there is an

edge (X, Y ) in E . Thus, every conjunct in becomes a clique in G P .

Let P 4 denote the following graph with 4 vertices: u − v − w − z ; that is, P 4 consists of a path

of length 4 and no other edges. We say that is P 4 -free if no subgraph induced by 4 vertices in G P is

isomorphic to P 4 . Finally, we say that is normal if for every clique in G P , there exists a conjunct in

containing all variables in the clique. The following characterization of unate read-once formulas

is due to Gurvich [ 1991 ].

Theorem 5.8

A unate formula is read-once iff it is P 4 -free and normal.

For example, the primal graph of XU ∨ XV ∨ YU ∨ YV is the complete bipartite graph with

vertices

; thus, it is P 4 -free and normal: sure enough, the formula can be written as

(X ∨ Y) ∧ (U ∨ V) , which is read-once. On the other hand, the primal graph of XY ∨ YZ ∨ ZU

is precisely P 4 ; hence, it is not read-once. The primal graph of XY

{ X, Y }

and

{ U, V }

∨

YZ contains the clique

{ X, Y, Z }

but there is no minterm XYZ , hence the formula is not normal and, therefore, not read-

once.

If is given as a read-once expression, then P () can be computed in linear time, by simply

running Algorithm 2 , and applying only the independent-and, independent-or, and negation rules.

Moreover, if a unate DNF formula admits a read-once equivalent , then can be computed

from in polynomial time [ Golumbic et al. , 2005 ]. Thus, read-once formulas can be evaluated very

efficiently, and this justifies our special interest in this class of formulas.

Probabilistic Databases

Search WWH ::

Custom Search

Home