Probabilistic Ranking Queries on Uncertain Data - Ranking Queries on Uncertain Data

5.5.3 PRist+ and a Fast Construction Algorithm

We can reduce the construction time of PRist by bounding top-$k$ probabilities using the binomial distribution.

Consider a tuple $t \in T$ and its compressed dominant set $T(t)$. If there is any tuple or generation rule-tuple with probability 1, we can remove the tuple from $T(t)$, and compute the top-$(k-1)$ probability of $t$. Thus, we can assume that the membership probability of any tuple or rule-tuple in $T(t)$ is smaller than 1.
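The reduction described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the book's implementation: the helper names `count_distribution`, `at_most`, and `drop_certain` are hypothetical, the compressed dominant set is represented as a plain list of membership probabilities, and its members are assumed to be independent.

```python
def count_distribution(probs):
    """dist[j] = probability that exactly j of the independent members appear."""
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for j, q in enumerate(dist):
            nxt[j] += q * (1.0 - p)   # this member is absent
            nxt[j + 1] += q * p       # this member is present
        dist = nxt
    return dist


def at_most(probs, k):
    """Probability that at most k members appear (0 if k is negative)."""
    if k < 0:
        return 0.0
    return sum(count_distribution(probs)[: k + 1])


def drop_certain(probs, k):
    """Remove members with probability 1 and lower the threshold accordingly.

    A member with probability 1 always appears, so it always occupies one
    position ahead of t. Dropping it and reducing k by one leaves the
    probability unchanged.
    """
    remaining = [p for p in probs if p < 1.0]
    return remaining, k - (len(probs) - len(remaining))


probs = [1.0, 0.3, 0.6, 1.0, 0.5]   # hypothetical membership probabilities
k = 3
rest, k_rest = drop_certain(probs, k)
before = at_most(probs, k)
after = at_most(rest, k_rest)
print(rest, k_rest, before, after)
```

If the reduced threshold becomes negative, more than $k$ members are certain to appear and the probability is 0, which `at_most` returns directly.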

Theorem 5.14 (Bounding the probability). For a tuple $t \in T$, let $T(t)$ be the compressed dominant set of $t$. Then,

$$F(k; N, p_{max}) \le \sum_{0 \le j \le k} Pr(T(t), j) \le F(k; N, p_{min}) \qquad (5.3)$$

where $p_{max}$ and $p_{min}$ are the greatest and the smallest probabilities of the tuples/rule-tuples in $T(t)$ ($0 < p_{min} \le p_{max} < 1$), $N$ is the number of tuples/rule-tuples in $T(t)$, and $F$ is the cumulative distribution function of the binomial distribution.
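As a rough numerical check of Inequality 5.3, the sketch below compares the exact probability that at most $k$ members of a compressed dominant set appear with the two binomial bounds. The function names and the sample probabilities are hypothetical, the members are assumed to be independent, and the exact value is computed with a simple dynamic program chosen for illustration only.

```python
from math import comb


def binomial_cdf(k, n, p):
    """F(k; n, p): probability of at most k successes in n Bernoulli(p) trials."""
    if k < 0:
        return 0.0
    k = min(k, n)
    return sum(comb(n, j) * p**j * (1.0 - p) ** (n - j) for j in range(k + 1))


def at_most(probs, k):
    """Exact probability that at most k of the independent members appear."""
    if k < 0:
        return 0.0
    dist = [1.0]  # dist[j] = probability that exactly j members appear
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for j, q in enumerate(dist):
            nxt[j] += q * (1.0 - p)
            nxt[j + 1] += q * p
        dist = nxt
    return sum(dist[: k + 1])


def binomial_bounds(probs, k):
    """(lower, upper) bounds from Inequality 5.3, using p_max and p_min."""
    if not probs:
        return 1.0, 1.0  # an empty set trivially has at most k members
    n = len(probs)
    return binomial_cdf(k, n, max(probs)), binomial_cdf(k, n, min(probs))


probs = [0.2, 0.5, 0.7, 0.4]  # hypothetical membership probabilities, all < 1
k = 1
lower, upper = binomial_bounds(probs, k)
exact = at_most(probs, k)
print(lower, exact, upper)
```

The bounds cost one binomial CDF evaluation each, whereas the exact value requires a pass over every member of the set. The bounds are loose when `p_min` and `p_max` are far apart, as in this sample.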

Proof. We first prove the left side of Inequality 5.3. For a tuple set $S$, let $Pr(S, \le k)$ denote $\sum_{0 \le j \le k} Pr(S, j)$. For any tuple $t' \in T(t)$, $Pr(t') \le p_{max}$.

Consider tuple set $S = T(t) - \{t'\}$ and $T'(t) = S \cup \{t_{max}\}$, where $Pr(t_{max}) = p_{max}$. From Theorem 5.2,

$$Pr(T(t), \le k) = Pr(t')\,Pr(S, \le k-1) + (1 - Pr(t'))\,Pr(S, \le k);$$

and

$$Pr(T'(t), \le k) = Pr(t_{max})\,Pr(S, \le k-1) + (1 - Pr(t_{max}))\,Pr(S, \le k).$$

Then,

$$Pr(T(t), \le k) - Pr(T'(t), \le k) = [Pr(t') - Pr(t_{max})] \times [Pr(S, \le k-1) - Pr(S, \le k)].$$

Since $Pr(t') \le Pr(t_{max})$ and $Pr(S, \le k-1) \le Pr(S, \le k)$, we have $Pr(T(t), \le k) \ge Pr(T'(t), \le k)$.

By replacing each tuple/rule-tuple in $T(t)$ with $t_{max}$, we obtain a set of tuples with the same probability $p_{max}$, whose subset probabilities follow the binomial distribution $F(k; N, p_{max})$. Thus, the left side of Inequality 5.3 is proved.

The right side of Inequality 5.3 can be proved similarly.
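A small worked instance may make the replacement argument concrete. The numbers are hypothetical: a compressed dominant set with two independent members of probabilities 0.2 and 0.5, and $k = 0$.

```latex
% Exact value: neither member appears.
Pr(T(t), \le 0) = (1 - 0.2)(1 - 0.5) = 0.4
% Replace the 0.2 member by t_max (p_max = 0.5): the probability can only drop.
F(0; 2, 0.5) = (1 - 0.5)^2 = 0.25
% Replace the 0.5 member by a tuple with p_min = 0.2: the probability can only rise.
F(0; 2, 0.2) = (1 - 0.2)^2 = 0.64
% Inequality 5.3 for this instance:
0.25 \le 0.4 \le 0.64
```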

Moreover, Hoeffding [183] gave the following bound.

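Theorem 5.15 (Extrema [183]). For a tuple $t \in T$ and its compressed dominant set $T(t)$, let $\mu = \sum_{t' \in T(t)} Pr(t')$. Then,

$$\sum_{j=0}^{k} Pr(T(t), j) \le F\left(k; N, \frac{\mu}{N}\right) \quad \text{when } 0 \le k \le \mu - 1; \text{ and}$$

$$\sum_{j=0}^{k} Pr(T(t), j) \ge F\left(k; N, \frac{\mu}{N}\right) \quad \text{when } \mu \le k \le N.$$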
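The sketch below illustrates how a Hoeffding-style bound of this kind might be applied. It assumes the usual statement of the result: with $\mu$ the sum of the membership probabilities and $N$ the number of members, $F(k; N, \mu/N)$ is an upper bound when $0 \le k \le \mu - 1$ and a lower bound when $\mu \le k \le N$. The function names and sample probabilities are hypothetical, and members are assumed to be independent.

```python
from math import comb


def binomial_cdf(k, n, p):
    """F(k; n, p): probability of at most k successes in n Bernoulli(p) trials."""
    if k < 0:
        return 0.0
    k = min(k, n)
    return sum(comb(n, j) * p**j * (1.0 - p) ** (n - j) for j in range(k + 1))


def at_most(probs, k):
    """Exact probability that at most k of the independent members appear."""
    if k < 0:
        return 0.0
    dist = [1.0]  # dist[j] = probability that exactly j members appear
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for j, q in enumerate(dist):
            nxt[j] += q * (1.0 - p)
            nxt[j + 1] += q * p
        dist = nxt
    return sum(dist[: k + 1])


def hoeffding_bound(probs, k):
    """Return (kind, value), where kind is 'upper', 'lower', or None.

    value is F(k; N, mu/N). kind is None when k lies strictly between
    mu - 1 and mu, where neither case of the bound applies.
    """
    n = len(probs)
    mu = sum(probs)
    value = binomial_cdf(k, n, mu / n)
    if 0 <= k <= mu - 1:
        return "upper", value
    if mu <= k <= n:
        return "lower", value
    return None, value


probs = [0.9, 0.8, 0.7, 0.5]  # hypothetical membership probabilities, mu = 2.9
results = {k: (hoeffding_bound(probs, k), at_most(probs, k)) for k in (1, 2, 3)}
for k, ((kind, value), exact) in results.items():
    print(k, kind, value, exact)
```

Unlike the bounds based on `p_max` and `p_min`, this bound uses the mean probability `mu / n`, so it tends to be tighter when the individual probabilities are spread out, at the cost of applying only for the stated ranges of `k`.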
