Database Reference
In-Depth Information
5. INTENSIONAL QUERY EVALUATION
in polynomial time. Thus, in the rest of this chapter, we will assume that we may use V () instead
Va r () , in order to compute it efficiently.
Proposition 5.3 is somewhat surprising in the sense that it states that pairwise syntactic in-
dependence implies independence. This shows that syntactic independence is a stronger notion
than independence because, in general, pairwise independent events are not necessarily syntactically
independent. For example, consider:
1 =
X
2 =
Y
3 =
(
¬
X
∧¬
Y)
(X
Y)
Assuming P(X) = P(Y) = P(Z) = 1 / 2, we have P( 1 ) = P( 2 ) = P( 3 ) = 1 / 2. Also, 1
2 = 1 3 = 2 3 = XY , hence their probabilities are 1 / 4. Therefore, the three events
1 , 2 , 3 are pairwise independent, P( i j ) = P( i ) · P( j ) , for i = j . On the other
hand, P( 1
2
3 )
=
P(XY)
=
1 / 4
=
P( 1 )
·
P( 2 )
·
P( 3 )
=
1 / 8, meaning that the
three events are not independent.
It is interesting to note that a converse to Proposition 5.3 also holds:
P( 2 ),
for any choice of the probabilities P(X = a)of the atomic events, then 1 , 2 are syntactically independent.
If 1 , 2 are independent probabilistic events, i.e., P( 1
2 )
=
P( 1 )
·
Proposition 5.4
Proof. For each random variable X i , with domain Dom X i
={ a 0 ,a 1 ,...,a m i }
, we introduce m i
real variables x i, 1 ,x i, 2 ,...,x i,m i
representing the probabilities of the atomic events x i, 1 =
P(X
=
a 1 ),...,x i,m i
= P(X = a m i ) : we do not need a variable for P(X = a 0 ) since this value is uniquely
defined as 1
x i,m i . We claim that the probability P () of any propositional
formula is a multi-variate polynomial in the variables x i,j with the following properties: (a) it is
a multi-linear, i.e., the degree of each variable x i,j is a most 1, and (b) if two variables x i 1 ,j 1 and
x i 2 ,j 2 occur in the same monomial, then i 1 = i 2 . Indeed, the claim follows from the fact that for
each assignment θ , the probability P(θ) given by Eq. (2.2) obviously satisfies these two properties,
and the fact P () is a sum of such polynomials ( Eq. (2.3) ).
Using this claim, we prove the proposition. Let be any propositional formula and θ an
assignment to some of its variables, i.e., θ(X i ) = a j , for some subset of variables X i . Then, the
polynomial P( [ θ ] ) is obtained from P() by performing the following substitution on the variables
x i,j for which θ(X i ) is defined, θ(X i )
x i, 1
x i, 2
...
=
a j :if j> 0 then substitute x i,j
=
1 and set all other
x i,k = 0, and if j = 0 then set substitute all variables x i, 1 = x i, 2 = ... = 0.
Suppose now Eq. (5.1) holds as an identity. Assume that X i
Va r ( 1 )
Va r ( 2 ) : we will
derive a contradiction. Since X i
Va r ( 1 ) , there exists some substitution θ of all variables other than
X i , and two values a j 1
; that means that one
of the two expressions is true and the other false , implying P( 1 [ θ,X i a j 1 ] ) = P( 1 [ θ,X i
a j 2 ] ) (one is 0 the other is 1). The polynomial P( 1 [ θ ] ) can only depend on the variables
x i, 1 ,x i, 2 ,... , since all variables other than X i are substituted with constants. Since it cannot be
and a j 2
such that 1 [
θ,X i
a j 1 ] =
1 [
θ,X i
a j 2 ]
Search WWH ::




Custom Search