Probabilistic Ranking Queries on Uncertain Data - Ranking Queries on Uncertain Data

Database Reference

In-Depth Information

To proof Theorem 5.12, We first prove t

L i <

MR p (

)

. Following the defini-

−

tion of PRist ,wehave Pr t . L i

(

) ≤

. Thus, MR i − 1

(

) ≥

L i , which follows from

−

h <

Lemma 5.3. Moreover, since

p ,wehave MR i − 1

(

) ≤

MR p (

)

, which follows

from Lemma 5.2. Thus, t

L i <

MR p (

)

Similarly, MR p (

) ≤

U i can be proved.

Step 2: Pruning and Validating Given a top-

(

)

query, let b i be the prob-interval

U as a pruning condition. Any tuple t

whose lower rank in b i is at least k can be pruned. Moreover, for any tuple t ,ifthe

upper rank of t in b i is not greater than l lower ranks of other tuples in b i , then t can

be validated.

containing p . We use the l -th rank k in list b i

Theorem 5.13 (Answering Top-

queries). Let T be a set of uncertain tuples

indexed by PRist with granularity parameter h. For a tuple t

(

)

∈

S, a top-

(

)

query,

and prob-interval b i such that p

∈

b i :

l, then any tuple t such that

1. Pruning : Let T 1 = {

t x |

t x .

U i ≤

U i ∧

t x ∈

}

.If

T 1 |≥

t .

U i is not an answer to the query;

2. Validating : Let T 2 = {

L i ≥

t y |

t y .

L i ≤

U i ∧

t y ∈

}

.If

T 2 |≤

l, then t is an answer to

the query.

Proof. To prove the first item, we only need to show that there are at least l tuples

whose p -ranks are smaller than MR p (

t )

. For any tuple t x ∈

T 1 ,wehave MR p (

t x ) ≤

U i , which is guaranteed by Theorem 5.12. Similarly, for any tuple t such

t x .

U i ≤

that t .

t ) ≥

t )

L i ≥

U i ,wehave MR p (

U i . Therefore, MR p (

t x ) ≤

MR p (

. Since

l , there are at least l tuples like t x . So the first item holds.

To prove the second item, we only need to show that there are fewer than l tuples

whose p -ranks are smaller than MR p (

T 1 |≥

. For any tuple t ∈

T 2 , since t .

)

L i >

U i ,

t ) >

(

)

MR p

. Therefore, only the tuples in T 2 may have smaller p -ranks than

MR p

(

)

|≤

. Since

T 2

l , the second item holds.

Step 3: Evaluating For any tuple that can neither be validated nor be pruned, we

need to calculate their exact p -ranks. To calculate the p -rank of tuple t , we calculate

the top-1, top-2,

···,

top- i probabilities for each rank i until we find the first rank k

such that Pr k

p .

The complexity is analyzed as follows. Let d 1 be number of tuples that cannot

be pruned in the pruning and validation phase. In the evaluating phase, the p -ranks

of the tuples that cannot be pruned or validated are calculated. For each such tuple,

its p -rank can be m in the worst case, where m is the number of rules in T . Then,

calculating the p -rank for each tuple takes O

(

) ≥

m 2

time. Let d 2 be the number of

tuples that cannot be pruned or validated. The evaluating step takes O

(

)

m 2 d 2 )

(

time.

m 2 d 2 )

The overall complexity is O

(

d 1 +

Ranking Queries on Uncertain Data

Search WWH ::

Custom Search

Home