Related Work - Ranking Queries on Uncertain Data

Database Reference

In-Depth Information

A

φ

-quantile

(

0

< φ ≤

1

)

of a collection of N data elements is the element with

rank

against a monotonic order specified on the data set. The main paradigm

is to continuously and efficiently maintain a small data structure (sketch/summary)

in space over data elements for online queries. It has been shown in [110, 111, 112,

113] that a space-efficient

φ

N

φ

-approximation quantile sketch can be maintained so

, it is always possible to find an element at rank r with the uni-

form precision guarantee

that, for a quantile

φ

r − φ

≤ ε

N . Due to the observation that many real

data sets often exhibit skew towards heads (or tails depending on a given monotonic

order), relative rank error (or biased) quantile computation techniques have been

recently developed [114, 107, 109], which give better rank error guarantees towards

heads.

Top- k queries have been extended to data streams. In [5], Mouratidis el al. study

the problem of continuous monitoring top- k queries over sliding windows. Very

recently, [115] improves the performance of the algorithms.

N

3.5.2.1 How Is Our Study Related?

All the existing studies on continuous ranking or quantile queries on data streams

do not consider uncertain data. Those methods cannot be extended to probabilistic

threshold top- k queries on uncertain data directly due to the complexity of possible

worlds semantics. In this topic, we investigate native methods for uncertain data

streams.

3.5.3 Continuous Sensor Stream Monitoring

Sensor stream monitoring focuses on maintaining the answers to deterministic

queries in sensor networks, while reducing the energy consumption as much as pos-

sible. Deshpande et al. [116] built a correlation-aware probabilistic model from the

stored and current data in a sensor network, and use the probabilistic model to an-

swer SQL queries. Only approximate answers with certain confidence bounds are

provided, but the cost of data maintenance and query answering is significantly re-

duced. More specifically, Liu et al. [117] studied Min/Max query monitoring over

distributed sensors. In their scenario, queries are submitted to a central server, and

the major cost in query answering is the communication cost between the central

server and distributed sensors. The authors model the reading of each sensor as

a random variable, whose probability distribution can be obtained from historical

data. Those distributions are used to estimate the answer to any Min/Max query.

The server also contacts a small number of sensors for their exact readings, in order

to satisfy the user specified error tolerance. [118] considers the applications where

multiple sensors are deployed to monitor the same region. A sampling method is

used to answer continuous probabilistic queries. The values of sensors that have

little effect on the query results are sampled at a lower rate.

Ranking Queries on Uncertain Data

Search WWH ::

Custom Search

Home