Related Work - Ranking Queries on Uncertain Data

Database Reference

In-Depth Information

with tuple level uncertainty, where each tuple in a table is associated with a mem-

bership probability. Kimelfeld and Sagiv [58] studied the maximal join queries on

probabilistic data, where only the answers whose probabilities are greater than a

threshold and are not contained by any other output answers are returned.

3.1.3 Indexing Uncertain Data

There are two categories of indexes on uncertain data. The first category is for un-

certain numeric data, such as sensor values or locations of moving objects. Tao et

al. [20, 59] proposed a U-tree index to facilitate probabilistic range queries on uncer-

tain objects represented by multi-dimensional probability density functions. Ljosa

et al. [60] developed an APLA-tree index for uncertain objects with arbitrary proba-

bility distributions. An APLA for each object is an adaptive piecewise linear approx-

imation and can be regarded as a time series. All those time series are organized by

an APLA-tree in a hierarchical manner. Besides, Cheng et al. [19] developed PTI,

probability thresholding index, to index uncertain objects with one dimensional un-

certain values. Bohm et al. [61] developed an index for uncertain objects whose

probability density functions are Gaussian functions N

(μ , σ )

, by treating each ob-

ject as a two dimensional points

and indexing the points in an R-tree.

The second category is for uncertain categorical data, such as RFID data or text

labels. For example, Singh et al. [62] extended the inverted index and signature tree

to index uncertain categorical data.

(μ , σ )

3.2 Ranking (Top- k ) Queries

There are numerous existing methods for answering top- k queries on static data. The

threshold algorithm (TA) [6] is one of the fundamental algorithms. TA first sorts the

values in each attribute and then scans the sorted lists in parallel. A “stopping value”

is maintained, which acts as a threshold to prune the tuples in the rest of the lists

if they cannot have better scores than the threshold. Several variants of TA have

been proposed, such as [4]. [29] provides a comprehensive survey about the ranking

queries and evaluation algorithms. The recent developments and extensions of top- k

query answering include using views to answer top- k queries efficiently [3], remov-

ing redundancy in top- k patterns [63], applying multidimensional analysis in top- k

queries [64], continuous monitoring of top-k queries over a sliding window [5], and

so forth.

Search WWH ::

Custom Search

Home