Probabilistic Ranking Queries on Uncertain Data - Ranking Queries on Uncertain Data

Database Reference

In-Depth Information

probability of t needs to be recomputed after removing L R , which requires a cost

of 1.

Therefore, in any of the three cases, the cost of lazy reordering is not more than

the cost of aggressive reordering. The conclusion folows.

5.2.3 Pruning Techniques

So far, we implicitly have a requirement: all tuples in T are scanned in the ranking

order. However, a probabilistic ranking query or reverse query is interested in only

those tuples passing the query requirement. Can we avoid retrieving or checking all

tuples satisfying the query predicates?

Some existing methods such as the well known TA algorithm [4] can retrieve

in batch tuples satisfying the predicate in the ranking order. Using such a method,

we can retrieve tuples in T progressively in the ranking order. Now, the problem

becomes how we can use the tuples seen so far to prune some tuples ranked lower

in the ranking order.

Consider rank parameter k and probability threshold p . We give four pruning

rules: Theorems 5.4 and 5.5 can avoid checking some tuples that cannot satisfy

the probability threshold, and Theorems 5.6 and 5.7 specify stopping conditions.

The tuple retrieval method ( e.g., an adaption of the TA algorithm [4]) uses the

pruning rules in the retrieval. Once it can determine all remaining tuples in T fail

the probability threshold, the retrieval can stop.

Please note that we still have to retrieve a tuple t failing the probability threshold

if some tuples ranked lower than t may satisfy the threshold, since t may be in the

compressed dominant sets of those promising tuples.

T, Pr k

Theorem 5.4 (Pruning by membership probability). For a tuple t

∈

(

) ≤

. Moreover, if t is an independent tuple and Pr k

(

)

(

) <

p, then

1. for any independent tuple t such that t

f t and Pr

t ) ≤

,Pr k

t ) <

p; and

2. for any multi-tuple rule R such that t is ranked higher than all tuples in R and

(

)

(

,Pr k

t ) <

p for any t ∈

(

) ≤

(

)

(

To use Theorem 5.4, we maintain the largest membership probability p member of

all independent tuples and rule-tuples for completed rules checked so far whose top-

k probability values fail the probability threshold. All tuples identified by the above

pruning rule should be marked failed.

A tuple involved in a multi-tuple rule may be pruned using the other tuples in the

same rule.

Theorem 5.5 (Pruning by tuples in the same rule). For tuples t and t in the same

multi-tuple rule R, if t

f t ,Pr

t )

, and Pr k

p, then Pr k

t ) <

(

) ≥

(

) <

(

Based on the above pruning rule, for each rule R open with respect to the current

tuple, we maintain the largest membership probability of the tuples seen so far in R

Ranking Queries on Uncertain Data

Search WWH ::

Custom Search

Home