Database Reference
In-Depth Information
A
φ
-quantile
(
0
< φ
1
)
of a collection of N data elements is the element with
rank
against a monotonic order specified on the data set. The main paradigm
is to continuously and efficiently maintain a small data structure (sketch/summary)
in space over data elements for online queries. It has been shown in [110, 111, 112,
113] that a space-efficient
φ
N
φ
-approximation quantile sketch can be maintained so
, it is always possible to find an element at rank r with the uni-
form precision guarantee
that, for a quantile
φ
r φ
ε
N . Due to the observation that many real
data sets often exhibit skew towards heads (or tails depending on a given monotonic
order), relative rank error (or biased) quantile computation techniques have been
recently developed [114, 107, 109], which give better rank error guarantees towards
heads.
Top- k queries have been extended to data streams. In [5], Mouratidis el al. study
the problem of continuous monitoring top- k queries over sliding windows. Very
recently, [115] improves the performance of the algorithms.
N
3.5.2.1 How Is Our Study Related?
All the existing studies on continuous ranking or quantile queries on data streams
do not consider uncertain data. Those methods cannot be extended to probabilistic
threshold top- k queries on uncertain data directly due to the complexity of possible
worlds semantics. In this topic, we investigate native methods for uncertain data
streams.
3.5.3 Continuous Sensor Stream Monitoring
Sensor stream monitoring focuses on maintaining the answers to deterministic
queries in sensor networks, while reducing the energy consumption as much as pos-
sible. Deshpande et al. [116] built a correlation-aware probabilistic model from the
stored and current data in a sensor network, and use the probabilistic model to an-
swer SQL queries. Only approximate answers with certain confidence bounds are
provided, but the cost of data maintenance and query answering is significantly re-
duced. More specifically, Liu et al. [117] studied Min/Max query monitoring over
distributed sensors. In their scenario, queries are submitted to a central server, and
the major cost in query answering is the communication cost between the central
server and distributed sensors. The authors model the reading of each sensor as
a random variable, whose probability distribution can be obtained from historical
data. Those distributions are used to estimate the answer to any Min/Max query.
The server also contacts a small number of sensors for their exact readings, in order
to satisfy the user specified error tolerance. [118] considers the applications where
multiple sensors are deployed to monitor the same region. A sampling method is
used to answer continuous probabilistic queries. The values of sensors that have
little effect on the query results are sampled at a lower rate.
Search WWH ::




Custom Search