Related Work - Ranking Queries on Uncertain Data

Database Reference

In-Depth Information

When typicality queries are computed on graphs or Euclidean planes, some ideas

in [95, 96] may be borrowed.

3.4 Probabilistic Ranking Queries

In this section, we first review the recent proposals of ranking queries and evaluation

algorithms on uncertain data. Then, we link the problem of ranking uncertain data

to the counting principle in probability and statistics, which provides more insights

into the ranking uncertain data problem.

3.4.1 Top-k Queries on Uncertain Data

Generally, ranking queries on uncertain data can be classified into the following two

categories.

3.4.1.1 Category I: Extensions of traditional ranking queries

The first category is extensions to traditional ranking queries on certain data. That

is, given a traditional ranking query with an objective function, all tuples are ranked

based on the objective function. Since the results to the query may be different in

different possible worlds, various queries capture and summarize the results using

different criteria.

Soliman et al. [17] proposed two types of ranking queries: U-Top k queries and

U- K Ranks queries. The answer to a U-Top k query is always a top- k tuple list in

some valid possible worlds, and the exact positions of the tuples in the list are pre-

served. A U- K Ranks query finds the tuple of the highest probability at each ranking

position. The tuples in the results of a U- K Ranks query may not form a valid top- k

tuple list in a possible world, though a U- K Ranks query always returns k tuples. A

tuple may appear more than once in the answer set if it has the highest probability

values to appear in multiple ranking positions, respectively. Lian and Chen devel-

oped the spatial and probabilistic pruning techniques for U- K Ranks queries [97].

Simultaneously with our study, Yi et al. [98] proposed efficient algorithms to answer

U-Top k queries and U- K Ranks queries. Silberstein et al. [32] model each sensor in

a sensor network as an uncertain object whose values follow some unknown distri-

bution. Then, a top- k query in the sensor network returns the top- k sensors such that

the probability of each sensor whose values are ranked top- k in any timestep is the

greatest. A sampling-based method collects all values in the network as a sample at

randomly chosen timesteps, and the answer to a top- k query is estimated using the

samples.

Search WWH ::

Custom Search

Home