Database Reference
In-Depth Information
with tuple level uncertainty, where each tuple in a table is associated with a mem-
bership probability. Kimelfeld and Sagiv [58] studied the maximal join queries on
probabilistic data, where only the answers whose probabilities are greater than a
threshold and are not contained by any other output answers are returned.
3.1.3 Indexing Uncertain Data
There are two categories of indexes on uncertain data. The first category is for un-
certain numeric data, such as sensor values or locations of moving objects. Tao et
al. [20, 59] proposed a U-tree index to facilitate probabilistic range queries on uncer-
tain objects represented by multi-dimensional probability density functions. Ljosa
et al. [60] developed an APLA-tree index for uncertain objects with arbitrary proba-
bility distributions. An APLA for each object is an adaptive piecewise linear approx-
imation and can be regarded as a time series. All those time series are organized by
an APLA-tree in a hierarchical manner. Besides, Cheng et al. [19] developed PTI,
probability thresholding index, to index uncertain objects with one dimensional un-
certain values. Bohm et al. [61] developed an index for uncertain objects whose
probability density functions are Gaussian functions N
, σ )
, by treating each ob-
ject as a two dimensional points
and indexing the points in an R-tree.
The second category is for uncertain categorical data, such as RFID data or text
labels. For example, Singh et al. [62] extended the inverted index and signature tree
to index uncertain categorical data.
, σ )
3.2 Ranking (Top- k ) Queries
There are numerous existing methods for answering top- k queries on static data. The
threshold algorithm (TA) [6] is one of the fundamental algorithms. TA first sorts the
values in each attribute and then scans the sorted lists in parallel. A “stopping value”
is maintained, which acts as a threshold to prune the tuples in the rest of the lists
if they cannot have better scores than the threshold. Several variants of TA have
been proposed, such as [4]. [29] provides a comprehensive survey about the ranking
queries and evaluation algorithms. The recent developments and extensions of top- k
query answering include using views to answer top- k queries efficiently [3], remov-
ing redundancy in top- k patterns [63], applying multidimensional analysis in top- k
queries [64], continuous monitoring of top-k queries over a sliding window [5], and
so forth.
Search WWH ::




Custom Search