Probabilistic Ranking Queries on Uncertain Data - Ranking Queries on Uncertain Data - page 93

Database Reference

In-Depth Information

rule−tuple compression

ti

tR

ti

generation rule R

generation rule−tuple, Pr(tR)=Pr(R)

Case 2: ti is ranked lower than all tuples in R

ti

rule−tuple compression

tR_left

ti

generation rule R

generation rule R

Case 3 (Subcase 1): ti is ranked between tuples in R and ti is not in R

Fig. 5.1 Computing Pr k

(

t i

)

for one tuple t i .

Corollary 5.2 (Tuple in generation rule). For a tuple t

∈

R such that

|

R

| >

1 ,

Pr k Q , T (

Pr k Q , T (

where uncertain table T =(

t |

t ∈

.

Proof. Since the tuples in R are mutually exclusive, the probability that one tuple in

R appears is the sum of the membership probability of each tuple in R . Therefore,

the corollary holds.

t

)=

t

)

S t i −{

R

} ) ∪{

t

}

For a tuple t and its dominant set S t , we can check t against the multi-tuple

generation rules one by one. Each multi-tuple generation rule can be handled by

one of the above two cases as illustrated in Figure 5.1, and the dependent tuples

in S t can be either compressed into some generation rule-tuples or removed due to

the involvement in the same generation rule as t . After the generation rule-tuple

compression, the resulting set is called the compressed dominant set of t , denoted

by T

(

t

)

. Based on the above discussion, for a tuple t

∈

T , all tuples in T

(

t

) ∪{

t

}

are independent, Pr k Q , T (

Pr k Q , T ( t ) ∪{ t } (

t

)=

t

)

. We can apply Theorem 5.2 to calculate

Pr k

(

t

)

by scanning T

(

t

)

once.

Example 5.1 (Generation rule-tuple compression). Consider a list of tuples t 1 ,···,

t 11

in the ranking order. Suppose we have two multi-tuple generation rules: R 1 =

t 2 ⊕

t 7 . Let us consider how to compute Pr 3

and Pr 3

t 4 ⊕

t 9 and R 2 =

t 5 ⊕

(

t 6 )

(

t 7 )

.

Tuple t 6 is ranked between tuples in R 1 , but t 6 ∈

R 1 . The first subcase of Case 3

should be applied. Thus, we compress R 1 left = {

t 2

,

t 4

}

into an generation rule-

(

)=

(

)+

(

)

tuple t 2 , 4 with membership probability Pr

t 2 , 4

Pr

t 2

Pr

t 4

. Similarly, t 6 is

also ranked between tuples in R 2 and t 6

∈

R 2 , but R 2 left = {

}

t 5

. The compression

does not remove any tuple. After the compression, T

(

t 6

)= {

t 1

,

t 2 , 4

,

t 3

,

t 5

}

. Since the

are independent, we can apply Theorem 5.2 to compute Pr 3

tuples in T

(

t 6

) ∪{

t 6

}

(

t 6

)

using T

.

Since t 7 ∈

(

t 6 )

R 2 , the tuples in R 2 except for t 7 itself should be removed. Thus, we

have T

(

t 7 )= {

t 1 ,

t 2 , 4 ,

t 3 ,

t 6 }

.

We can sort all tuples in T into a sorted list L in the ranking order. For each tuple

t i , by one scan of the tuples in L before t i , we obtain the compressed dominant set

Next Page

Ranking Queries on Uncertain Data

Search WWH ::

Custom Search

Home