Database Reference
In-Depth Information
where X 1 and X 1 are two clones of X 1 . In other words, the reason why the plan on the left is unsafe is
because it fails to recognize that two occurrences of X 1 are the same variable, but instead treats them
as independent variables, with the same probability as X 1 . This is captured by d , where we have
replaced the two occurrences of X 1 with two distinct variables X 1 and X 1 , with the same probability
p 1 as X 1 . The following proposition shows that P ()
P( d ) :
Proposition 4.31 Consider two propositional formulas:
d
k X k
= 0 1 X ... k X
=
0
1 X 1
...
such that P(X) = P(X 1 ) = ... = P(X k ) and none of the formulas i , i = 0 ,k contains any of the
variables X, X 1 ,...,X k . Then P () P( d ).
Intuitively, the proposition says this. We have a formula that has multiple occurrences of
some variable X : all these occurrences are positive (i.e., non-negated), and separated by
.Ifwe
“clone” each occurrence into a new variable X i , such that all the clones are independent and have
the same probability as X , then the probability can only increase. This cloning is called dissocia-
tion [ Gatterbauer et al. , 2010 , Gatterbauer and Suciu , 2011 ] and is related to relaxation [ Darwiche ,
2010 ] but differs from the latter in that it gives an upper bound guarantee.
Proof. We claim that P() P( d ) , where
d
2 X 2
= 1 X 2 X
=
1 X 1
and x
P(X 2 ) . The claim follows from the inclusion-exclusion formula:
P() = xP( 1 ) + xP( 2 ) xP( 1 2 ) P( d ) = xP( 1 ) + xP( 2 ) x 2 P( 1 2 )
thus P() P( d ) because x 2
=
P(X)
=
P(X 1 )
=
x .
We prove now the proposition by induction on k . For the base case, k
=
2, denote i =
( ¬ 0 ) i , for i = 1 , 2, then we have:
d
d
=
0
(
¬
0 ) 1 X 1
(
¬
0 ) 2 X 2 =
0
= 0 ( ¬ 0 ) 1 X ( ¬ 0 ) 2 X = 0
The base case follows from P( 0 ) = P( 0 ) + P() P( 0 ) + P( d ) = P( 0 d ) .We
prove now the inductive step. Consider:
d
= 0 1 X 1 ... k 2 X k 2 k 1 X k 1 k X k
=
k )Y
= 0 1 X ... k 2 X ( k 1 k )X
Then P () P( ) follows by induction (for k 1), and P( ) P( d ) follows from the base
case ( k =
0
1 X 1
...
k 2 X k 2
( k 1
2).
Search WWH ::




Custom Search