Probabilistic Ranking Queries on Uncertain Data - Ranking Queries on Uncertain Data

Database Reference

In-Depth Information

j = 1 Pr ( o , j ) .

Pr k

(

(2.2)

2.2.2.2 Ranking Criteria

Given a rank parameter k

,a probability

threshold top-k query ( PT-k query for short) [30, 31] finds the instances whose top- k

probabilities are no less than p .

0 and a probability threshold p

∈ (

]

Definition 2.12 (PT- k query and top-

(

)

query). Given a rank parameter k

,a probabilistic threshold top- k query ( PT-k

query for short) [30, 31] finds the instances whose top- k probabilities are no less

than p .

Alternatively, a user can use an answer set size constraint l

∈ (

]

and a probability threshold p

0 to replace the

probability threshold p and issue a top-

(

)

query [32, 33], which finds the top- l

tuples with the highest top- k probabilities.

Now, let us consider the reverse queries of PT- k queries and top-

(

)

queries.

For an instance o , given a probability threshold p

∈ (

]

, the p-rank of o is the

minimum k such that Pr k

Pr k

(

) ≥

p , denoted by MR p (

min

{

(

) ≥

}

Definition 2.13 (RT- k query and top-

(

)

query). Given a probability threshold

0, a rank threshold query ( RT-k query for

short) to retrieve the instances whose p -ranks are at most k .RT- k queries are reverse

queries of PT- k queries.

Alternatively, a user can replace the rank threshold by an answer set size con-

straint l

∈ (

]

and a rank threshold k

0 and issue a top-

(

)

query , which returns the top- l instances with the

smallest p -ranks. Clearly, top-

(

)

queries are reverse queries of top-

(

)

queries.

Interestingly, it is easy to show the following.

Corollary 2.3 (Answers to PT- k and RT- k queries). Given a set of uncertain ob-

jects S, an integer k

, the answer to a PT-k query with

rank parameter k and probability threshold p and that to a RT-k query with rank

threshold k and probability threshold p are identical.

Proof. An instance satisfying the PT- k query must have the p -rank at most k , and

thus satisfies the RT- k query. Similarly, an instance satisfying the RT- k query must

have the top- k probability at least p , and thus satisfies the PT- k query.

0 and a real value p

∈ (

]

PT- k queries and RT- k queries share the same set of parameters: a rank parameter

k and a probability threshold. Thus, as shown in Corollary 2.3, when the parameters

are the same, the results are identical. For top-

queries,

even they share the same value on the answer set size constraint l , the answers

generally may not be the same since the rank parameter and the probability threshold

select different instances.

(

)

queries and top-

(

)

Ranking Queries on Uncertain Data

Search WWH ::

Custom Search

Home