Top-k Typicality Queries on Uncertain Data - Ranking Queries on Uncertain Data

Database Reference

In-Depth Information

Algorithm 4.1 ExactTyp( OS , k )

Input: an uncertain object O = { o 1 ,..., o n } and positive integer k

Output: the k instances with the highest simple typicality values

Method:

1: for all instance o ∈ O do

2: set T ( o , O )= 0

3: end for

4: for i =

1to n do

for j

1to n do

d ( o i , o j )

e −

nh √ 2

w =

2 h 2

(

o i

(

o i

(

o j

(

o j

9: end for

10: end for

11: return the top- k instances according to T ( o , O )

to a special case of the top-1 simple typicality query problem. As so far no better

than quadratic algorithm has been found for exact solutions to the general discrete

1-median problem (except in L 1 metric space), it is challenging to find a better than

quadratic algorithm for computing exact answers to general top- k typicality queries.

We now present Algorithm 4.1, a straightforward method that computes the exact

answer to a top- k simple typicality query. It computes the exact simple typicality

for each instance using two nested loops, and then selects the top- k instance. The

complexity of Algorithm 4.1 is O

is the number of instances in

O . Quadratic algorithms are often too costly for online queries on large databases,

while good approximations of exact answers are good enough for typicality analysis.

This motivates our development of approximation algorithms.

( |

)

, where

4.1.3 A Randomized Tournament Algorithm

Inspired by the randomized tournament method [78] for the discrete 1-median prob-

lem, we propose a randomized tournament algorithm for answering top- k simple

typicality queries as follows.

Let t be a small integer, called the tournament group size . To find the most typ-

ical instances in object O of n instances, we partition the instances into

groups

randomly such that each group has t instances. For each group, we use the exact al-

gorithm to find the instance that has the largest simple typicality value in the group.

Only the winner instances in the groups are sent to the next round.

The winners of the previous round are again partitioned randomly into groups

such that each group contains t instances. The most typical instance in each group

is selected and sent to the next round. The tournament continues until only one

instance is selected as the winner. The final winner is an approximation of the most

typical instance.

Ranking Queries on Uncertain Data

Search WWH ::

Custom Search

Home