Probabilistic Ranking Queries on Uncertain Data - Ranking Queries on Uncertain Data

Database Reference

In-Depth Information

T,Pr k Q , T (

Pr k Q , S t (

Theorem 5.1 (The dominant set property). For a tuple t

∈

)

where Pr k Q , T (

and Pr k Q , S t (

)

are the top-k probabilities of t computed using tuples

in T and in S t , respectively.

Proof. The theorem follows with the definition of top- k probability directly.

Using the dominant set property, we scan the tuples in T in the ranking order and

derive the rank- k probabilities for each tuple t

T based on the tuples preceding t .

Generation rules involving multiple tuples are handled by the rule-tuple compres-

sion technique developed later in this section.

∈

5.1.2 The Basic Case: Independent Tuples

We start with the basic case where all tuples are independent. Let L

t n be

the list of all tuples in table T in the ranking order. Then, in a possible world W ,a

tuple t i ∈

t 1

···

(

≤

)

is ranked at the j -th

(

)

position if and only if exactly

(

−

)

tuples in the dominant set S t i = {

t 1 ,...,

t i − 1 }

also appear in W . The subset

probability Pr

(

S t i ,

)

is the probability that j tuples in S t i

appear in possible worlds.

Trivially, we have Pr

(

1 and Pr

(

0 for 0

≤

n . Then, the rank- k

probability can be computed as

(

t i ,

(

t i )

(

S t i ,

−

)

Moreover, the top- k probability of t i is given by

j = 1 Pr ( t i , j )= Pr ( t i )

j = 1 Pr ( S t i , j − 1 )

Pr k

(

t i )=

k ,wehave Pr k

Particularly, when i

The following theorem can be used to compute the subset probability values

efficiently.

≤

(

t i )=

(

t i )

Theorem 5.2 (Poisson binomial recurrence). In the basic case, for 1

≤

≤|

i j

1. Pr

(

S t i ,

(

S t i − 1 ,

)(

−

(

t i )) = ∏

(

−

(

t i ))

;

2. Pr

(

S t i ,

(

S t i − 1 ,

−

)

(

t i )+

(

S t i − 1 ,

)(

−

(

t i ))

Proof. In the basic case, all tuples are independent. The theorem follows with the

basic probability principles. This theorem is also called the Poisson binomial recur-

rence in [103].

Ranking Queries on Uncertain Data

Search WWH ::

Custom Search

Home