Extensional Query Evaluation - Probabilistic Databases

Database Reference

In-Depth Information

Lemma 4.4

Let x be a separator variable in Q

=∃

x.Q . Then for any two distinct constants a

the queries Q [ a/x ]

and Q [ b/x ]

are syntactically independent.

Proof. Suppose the contrary, and let L 1 , L 2 be two atoms in Q [ a/x ]

and Q [ b/x ]

, respectively, such

that L 1 ,L 2 unify. Let L 1 ,L 2 be the atoms in Q from which L 1 ,L 2 are derived, i.e., L 1 =

L 1 [

a/x

]

and L 2 = L 2 [ b/x ]

. Then, obviously, L 1 ,L 2 unify; hence, since x is a separator variable, we must have

. But, in that case, L 1 has a on a position where L 2 has b , contradicting

Pos(L 1 ,x)

∩

Pos(L 2 ,x)

=∅

the fact that they unify.

The independent project rule is the following. Suppose a relational query can be written as

∃ x.Q where x is a separator variable. Then:

P( ∃ x.Q)

−

− P(Q [ a/x ] ))

Independent-project

( 1

(4.2)

a ∈

ADom

ADom is the active domain of the database D . For a simple example, consider the query

Q =∃ x. ∃ y.(R(x), S(x, y)) . Here x is both root variable and a separator variable; therefore, if the

active domain is ADom

≤ i ≤ n are

independent events. Denote p i = P(R(a i ) ∧∃ y.S(a i ,y)) ; then, by the independent-project rule,

the probability of Q is P (Q) =

={ a 1 ,...,a n }

, then the queries Q i = R(a i ) ∧∃ y.S(a i ,y) , 1

− i ( 1

− p i ) .

While every separator variable is also a root variable, the converse is not true in general; in

∃

z.U(y,x),U(x,z) , the variable x is a root variable, but not a separator variable. It is easy

to see that we cannot apply the independent project rule on the variable x because the two queries

∃

z.U(y,b),U(b,z) are dependent events; they both depend on the

tuple U(a,b) (and also on U(b,a) ). We have seen in Example 3.3 that this query is hard.

To obtain separator variables, we often apply the following logical equivalence:

∃

z.U(y,a),U(a,z) and

∃

∃ x 1 .Q 1 ∨∃ x 2 .Q 2 =∃ x.(Q 1 [ x/x 1 ]∨ Q 2 [ x/x 2 ] )

(4.3)

For example, consider the query:

Q U

=∃

x 1 .

∃

y 1 .R(x 1 ), S(x 1 ,y 1 )

∨∃

x 2 .

∃

y 2 .T (x 2 ), S(x 2 ,y 2 )

(4.4)

The query can be written as

∃ x.( ∃ y 1 .R(x), S(x, y 1 ) ∨∃ y 2 .T (x), S(x, y 2 )) , and now x is a separator

variable.

4.1.2.3 Rule 3: Independent Union

Suppose a relational query can be written as Q 1 ∨ Q 2

where Q 1

and Q 2

are two syntactically

independent queries. Then:

Independent-union

P(Q 1 ∨ Q 2 ) =

− ( 1

− P(Q 1 )) · ( 1

− P(Q 2 ))

(4.5)

Probabilistic Databases

Search WWH ::

Custom Search

Home