Database Reference
In-Depth Information
instance carries a membership probability. An uncertain object arrives in whole at a
time and does not change after the arrival. In other words, uncertain objects do not
evolve over time. New uncertain objects keep arriving. Several one pass streaming
algorithms are proposed to estimate the statistical aggregates of the probabilistic
data.
Most recently, Kanagal and Deshpande [43] proposed a probabilistic sequence
model that considers the temporal and spatial correlations among data. Given a set
of uncertain attributes
(
,···,
)
A 1
A m
, each uncertain attribute A i (1
i
m ) is a dis-
crete random variable in domain dom
(
A i
)
whose distribution is evolving over time.
A probabilistic sequence contains, for each time instant t , an instance
v t 1 ,···,
v t m )
(
, where each v i
for
)
with certain probability distribution. It is a Markov sequence since the random vari-
ables at t only depends on the random variables at t
(
A 1
,···,
A m
)
(1
i
m ) is a random variable in domain dom
(
A i
1. Graphical models are used
to describe the correlations among the random variables in two consecutive instants.
Query answering is considered as inferences over the graphical models.
3.5.1.1 How Is Our Study Related?
Our study is different from [104, 105, 106, 43] in following two important aspects.
First, the uncertain stream model proposed in this topic is the substantially differ-
ent from the ones proposed before. In the probabilistic sequence model proposed
in [43], each element in the stream is a random variable (distribution). While we
model an uncertain stream as a series of sample instances generated by a temporal
random variable. The set of random variables (i.e., uncertain objects) are fixed. The
distributions of those random variables evolve over time. Our model handles some
application scenarios that are not covered by the models in [104, 105, 106, 43].
Second, we focus on continuous probabilistic threshold top- k queries on slid-
ing windows, a novel type of queries on uncertain data streams that have not been
addressed before. [104, 105, 106] deal with aggregates on a whole stream. The op-
erators discussed in [43] cannot be directly used to answer continuous probabilistic
threshold top- k queries.
3.5.2 Continuous Ranking and Quantile Queries on Data Streams
A rank or quantile query is to find a data entry with a given rank against a monotonic
order specified on the data. Rank queries have several equivalent variations [107,
108, 109] and play important roles in many data stream applications.
It has been shown in [105] that an exact computation of rank queries requires
memory size linear to the size of a data set by any one-scan technique, which may
be impractical in on-line data stream computation where streams are massive in size
and fast in arrival speed. Approximately computing rank queries over data streams
has been investigated in the form of quantile computation.
Search WWH ::




Custom Search