Database Reference
In-Depth Information
Using Lemma 5.1, we have
e με 2
Pr
[
X
<
k
]
Pr
[
X
< (
1
ε)μ ] <
2
Pr
[
X
<
k
] <
p holds if
e με 2
p
(5.2)
2
Combining Inequality 5.1 and 5.2, we get 2 ln p μ(
k
2 . The inequality in
1
μ )
Theorem 5.9 is the solution to the above inequality.
t )
value is monotonically increasing if tuples are
sorted in the ranking order. Using Theorem 5.9 an algorithm can stop and avoid
retrieving further tuples in the rear of the sorted list if the
Since
μ = ∑
Pr
(
, the
μ
t
T
(
t
)
μ
value of the current
tuple satisfies the condition in Theorem 5.9.
The value of parameter k is typically set to much smaller than the number of
tuples in the whole data set. Moreover, since a user is interested in the tuples with
a high probability to be ranked in top- k , the probability threshold p is often not too
small. Consequently,
μ
is often a small value. For example, if k
=
100, p
=
0
.
3, then
the stopping condition is
117.
In the experiments, we show in Figure 5.4 that the exact algorithm and the sam-
pling algorithm stop close to the general stopping condition. The results verify the
tightness of the stopping condition.
μ
5.4.3 A Poisson Approximation Based Method
When the success probability is small and the number of Poisson trials is large, Pois-
son binomial distribution can be approximated well by Poisson distribution [185].
For a set of Poisson trials X 1 ,...,
n
i
X n such that Pr
(
X i =
1
)=
p i , let X
=
1 X i .
=
n
i
X follows a Poisson binomial distribution. Let
μ =
E
[
X
]=
0 p i . The probability
=
k
k ! e μ , where f
μ)= μ
of X
=
k can be approximated by Pr
(
X
=
k
)
f
(
k ;
(
k ;
μ)
is the Poisson probability mass function. Thus, the probability of X
<
k can be
μ)= Γ ( k + 1 , μ)
k !
approximated by Pr
(
X
<
k
)
F
(
k ;
, where F
(
k ;
μ)
is the cumulative
y t x 1 e t dt is the
upper incomplete gamma function. Theoretically, Le Cam [186] showed that the
quality of the approximation has the upper bound
distribution function corresponding to f
(
k ;
μ)
, and
Γ (
x
,
y
)=
l
k = 0 Pr ( X = k )
l
k = 0 f ( k ; μ)
sup
0
9 ma i {
p i }.
l n
The above upper bound depends on only the maximum success probability in the
Poisson trials. In the worst case where max i {
1, the error bound is very loose.
However, our experimental results (Figure 5.6) show that the Poisson approximation
method achieves very good approximation quality in practice.
p i } =
 
Search WWH ::




Custom Search