Database Reference
In-Depth Information
In a Poisson binomial distribution X , the probability density of X is unimodal
(i.e., first increasing then decreasing), and attains its maximum at
μ =
E
[
X
]
[183].
Therefore, when the query parameter k varies from 1 to
|
T
(
t
) | +
1, Pr
(
t
,
k
)
follows
the similar trend.
Corollary 5.3 (Distribution of rank- k probability). For a tuple t
T,
=
0
,
if k
>|
T
(
t
) | +
1 ;
1. Pr
(
t
,
k
)
<
Pr
(
t
,
k
+
1
) ,
if k
μ
1 ;
>
Pr
(
t
,
k
+
1
) ,
if k
μ
.
2. arg max | T ( t ) | + 1
j =
t )
Pr
(
t
,
j
)=μ +
1
,
where
μ = t T ( t ) Pr
(
.
1
5.4.2 A General Stopping Condition
Corollary 5.3 shows that, given a tuple t and its compressed dominant set T
(
t
)
,
the most possible ranks of t are around
1, then
the top- k probability of t is small. Now, let us use this property to derive a general
stopping condition for query answering algorithms progressively reading tuples in
the ranking order. That is, once the stopping condition holds, all unread tuples that
cannot satisfy the query can be pruned. The stopping condition is independent from
the number of tuples in the data set, and dependent on only the query parameter k
and the probability threshold p .
μ +
1. In other words, if k
μ +
Theorem 5.9 (A General Stopping Condition). Given a top-k query Q k
(
f
)
and
t )
. Then, Pr k
probability thresh old p, for a tup le t
T , let
μ = t T ( t ) Pr
(
(
t
) <
pif
ln 2 p +
2 k ln p .
Proof. To prove Theorem 5.9, we need Theorem 4.2 in [184].
ln p +
μ
k
+
Lemma 5.1 (Chernoff Bound of Poisson Trials [184]). Let X 1
,...,
X n be indepen-
[
=
]=
<
<
dent Poisson trials such that, for 1
i
n, Pr
X i
1
p i , where 0
p i
1 . Then,
n
i = 1 X i ,
n
i = 1 p i , and 0
for X
=
μ =
[
]=
< ε
E
X
1 , we have
e με 2
Pr
[
X
< (
1
ε)μ ] <
.
2
As discussed in Section 5.4.1, we can construct a set of Poisson trials correspond-
ing to the tuples in T
such that, for each tuple or rule-tuple t
(
t
)
T
(
t
)
, there is a
t )
corresponding trial whose success probability is the same as Pr
(
. Moreover,
k
1
j = 0 Pr ( T ( t ) , j )= Pr [ X < k ]
For 0
< ε
1, inequality Pr
[
X
<
k
]
Pr
[
X
< (
1
ε)μ ]
holds when
k
(
1
ε)μ
(5.1)
 
Search WWH ::




Custom Search