The expected value of Z_k is

$$
\begin{aligned}
E[Z_k] &= \sum_{i} \; \sum_{\theta \,\models\, \omega(\varphi_i)} \frac{P(\varphi_i)}{M} \cdot \frac{P(\theta)}{P(\varphi_i)} \cdot \frac{1}{|\{\varphi_j \mid \theta \models \omega(\varphi_j)\}|} \\
&= \sum_{\theta \,:\, \exists \varphi_i\; \theta \models \omega(\varphi_i)} \frac{P(\theta) \cdot |\{\varphi_i \mid \theta \models \omega(\varphi_i)\}|}{M \cdot |\{\varphi_j \mid \theta \models \omega(\varphi_j)\}|} \\
&= \frac{1}{M} \cdot \sum_{\theta \,:\, \exists \varphi_i\; \theta \models \omega(\varphi_i)} P(\theta) \\
&= \frac{P(\Phi)}{M} = \frac{p}{M},
\end{aligned}
$$
so Z_k is an unbiased estimator for p/M and E[Z] = N · p/M. We thus approximate p by p̂ = Z · M/N, whose expected value is E[Z] · M/N = p.
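To make the sampling procedure concrete, the following Python sketch implements the Karp-Luby estimator described above, under the assumption that each conjunct is given as a set of literals over independent Boolean variables; all identifiers (clause_prob, satisfies, karp_luby) are our own illustration, not from the text.

```python
import random

# A minimal sketch of the Karp-Luby estimator for a DNF over independent
# Boolean random variables. A conjunct phi_i is a dict {var: required_value};
# probs[var] is P(var = True); all variables are independent.

def clause_prob(clause, probs):
    """P(phi_i): the probability that every literal of the conjunct holds."""
    p = 1.0
    for var, val in clause.items():
        p *= probs[var] if val else 1.0 - probs[var]
    return p

def satisfies(theta, clause):
    """True iff the assignment theta satisfies the conjunct."""
    return all(theta[var] == val for var, val in clause.items())

def karp_luby(clauses, probs, N):
    """Return the estimate Z * M / N of p = P(Phi) after N trials."""
    weights = [clause_prob(c, probs) for c in clauses]
    M = sum(weights)  # M = sum_i P(phi_i)
    Z = 0.0
    for _ in range(N):
        # Pick conjunct i with probability P(phi_i) / M.
        i = random.choices(range(len(clauses)), weights=weights)[0]
        # Sample theta conditioned on satisfying phi_i: fix the literals
        # of phi_i and draw every other variable independently.
        theta = {
            var: clauses[i][var] if var in clauses[i]
            else random.random() < probs[var]
            for var in probs
        }
        # Z_k = 1 / |{ phi_j : theta satisfies phi_j }| (at least phi_i does).
        Z += 1.0 / sum(1 for c in clauses if satisfies(theta, c))
    return Z * M / N

# Example: Phi = (x AND y) OR (NOT y AND z) with all marginals 1/2; the two
# conjuncts are disjoint on y, so the true probability is 0.25 + 0.25 = 0.5.
clauses = [{"x": True, "y": True}, {"y": False, "z": True}]
probs = {"x": 0.5, "y": 0.5, "z": 0.5}
print(karp_luby(clauses, probs, N=10_000))  # should be close to 0.5
```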
Computing Z consists of summing up the outcomes of N Bernoulli trials. For such a scenario,
we can use the Chernoff bound
$$
\Pr\big[\, |Z - E[Z]| \ge \varepsilon \cdot E[Z] \,\big] \;\le\; 2 \cdot e^{-\varepsilon^2 \cdot E[Z] / 3}
$$
(cf., e.g., [ Mitzenmacher and Upfal , 2005 ], Eq. 4.6). By substitution, we get
$$
\Pr\big[\, |\hat{p} - p| \ge \varepsilon \cdot p \,\big] \;=\; \Pr\Big[\, \frac{N}{M} \cdot |\hat{p} - p| \ge \varepsilon \cdot \frac{N \cdot p}{M} \,\Big] \;\le\; 2 \cdot e^{-\frac{N \cdot p \cdot \varepsilon^2}{3 \cdot M}}
$$
and thus, since p/M ≥ 1/n,
$$
\Pr\big[\, |\hat{p} - p| \ge \varepsilon \cdot p \,\big] \;\le\; 2 \cdot e^{-\frac{N \cdot \varepsilon^2}{3 \cdot n}} \;=\; \delta .
$$
By choosing

$$
N \;=\; \frac{3}{\varepsilon^2} \cdot n \cdot \log\frac{2}{\delta} \tag{5.9}
$$
we get an (ε, δ) fully polynomial-time randomized approximation scheme (FPTRAS) for computing
the probability of a DNF over independent Boolean random variables.
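As a quick sanity check on Eq. (5.9), the sketch below computes the number of trials N for concrete, hypothetical parameter values; the helper name sample_size is our own.

```python
import math

def sample_size(eps, delta, n):
    """N from Eq. (5.9): (3 / eps^2) * n * log(2 / delta), rounded up."""
    return math.ceil(3.0 / eps**2 * n * math.log(2.0 / delta))

# For a DNF with n = 100 conjuncts, relative error eps = 0.1 guaranteed with
# failure probability delta = 0.01 requires about 158,950 trials.
print(sample_size(eps=0.1, delta=0.01, n=100))  # -> 158950
```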
Thus, we have two algorithms, the naïve Monte Carlo and Karp-Luby, for approximating the probability of a DNF expression. Whether one is preferable to the other depends on whether queries typically compute only large probabilities or not. Consider the two bounds on N for the naïve and the Karp-Luby algorithm, given by Eq. (5.8) and Eq. (5.9), respectively. The first is of the form N = C(ε, δ)/p while the second is of the form N = C(ε, δ) · n, where C(ε, δ) is O(log(2/δ)/ε²). Recall that p = P(Φ) is the probability of the formula, while n is the number of conjuncts in Φ. This suggests a trade-off between the two algorithms; the naïve MC is preferable when p ≥ 1/n, while Karp-Luby is preferable for smaller probabilities. A numeric comparison follows below.
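To see the trade-off in numbers, the short sketch below (our own illustration, with hypothetical parameter values) evaluates both bounds with the same constant C(ε, δ) = 3 · log(2/δ)/ε².

```python
import math

def C(eps, delta):
    # The shared constant C(eps, delta) = 3 * log(2/delta) / eps^2.
    return 3.0 * math.log(2.0 / delta) / eps**2

eps, delta, n = 0.1, 0.01, 100  # hypothetical values for illustration

for p in (0.5, 1e-6):
    naive = C(eps, delta) / p   # Eq. (5.8): naive Monte Carlo bound
    kl = C(eps, delta) * n      # Eq. (5.9): Karp-Luby bound
    print(f"p = {p:g}: naive ~ {naive:,.0f}, Karp-Luby ~ {kl:,.0f}")

# p = 0.5 : the naive bound (~3,179 trials) beats Karp-Luby's (~158,950);
# p = 1e-6: naive needs ~1.6e9 trials, so Karp-Luby wins decisively.
```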
 