Top-k Typicality Queries on Uncertain Data - Ranking Queries on Uncertain Data

Database Reference

In-Depth Information

4.4.2 A Randomized Tournament Method

A top- k representative typicality query can be answered using the randomized tour-

nament method.

At the beginning, the answer set A is empty, so the randomized tournament

method works exactly the same as finding the most typical instance using the ran-

domized tournament method as described in Section 4.1.3. The winner instance of

the tournament is added to A .

To compute the i -th

instance with the highest approximate representative

typicality score, a randomized tournament is conducted from bottom up, similar to

finding the first answer in A . The only difference is that the representative typicality

score of each instance in each group is computed, instead of the simple typicality

score. The instance with the maximal representative typicality score in each group

is the winner and is sent to the next round of tournament. The final winner is an

approximation of the i -th most representatively typical instance, and is added to A .

A top- k representative typicality query can be answered by k randomized tour-

naments. To ensure a higher accuracy, we can run each tournament several times,

and pick the winner instance with the highest representative typicality score on the

whole data set.

The complexity of the randomized tournament to find the i -th instance

(

i

>

1

)

(

i

≤

k

)

with the highest representative typicality score is O

, where v is the number of

times the tournament is run, t is the group size, and n is the size of the data set. This

is because, finding the instance with the highest representative typicality score in

each group takes O

(

vtn

)

t 2

n

t )

groups in total. To answer a top-

k representative typicality query, k randomized tournaments need to be conducted.

Therefore, the overall complexity is O

(

)

time, and there are O

(

kvtn

)

.

4.4.3 Local Typicality Approximation Methods

The locality property in simple typicality approximation can be extended to address

the representative typicality approximation.

Let A be the current reported answer set. The local group typicality of A is com-

puted by only considering the instances in the

σ

-neighborhood of o

∈

A . The intu-

ition is, if an instance is not in the

-neighborhood of o , then the contribution from

o to this instance is small and can be ignored.

σ

Definition 4.5 (Local group typicality). Given an uncertain object O , a neighbor-

hood threshold

be the random vector that

generates the samples O , the local group typicality of A is

σ

and a subset of instances A

⊂

O , let

X

,X , σ)= o ∈ A LT ( o ,{ o },X D ( o , A ) , σ ) · Pr ( N )

LGT

(

A

Ranking Queries on Uncertain Data

Search WWH ::

Custom Search

Home