Related Work - Ranking Queries on Uncertain Data

Database Reference

In-Depth Information

instance carries a membership probability. An uncertain object arrives in whole at a

time and does not change after the arrival. In other words, uncertain objects do not

evolve over time. New uncertain objects keep arriving. Several one pass streaming

algorithms are proposed to estimate the statistical aggregates of the probabilistic

data.

Most recently, Kanagal and Deshpande [43] proposed a probabilistic sequence

model that considers the temporal and spatial correlations among data. Given a set

of uncertain attributes

(

,···,

)

≤

A 1

A m

, each uncertain attribute A i (1

i

m ) is a dis-

crete random variable in domain dom

(

A i

)

whose distribution is evolving over time.

A probabilistic sequence contains, for each time instant t , an instance

v t 1 ,···,

v t m )

(

, where each v i

for

)

with certain probability distribution. It is a Markov sequence since the random vari-

ables at t only depends on the random variables at t

(

A 1

,···,

A m

)

(1

≤

i

≤

m ) is a random variable in domain dom

(

A i

1. Graphical models are used

to describe the correlations among the random variables in two consecutive instants.

Query answering is considered as inferences over the graphical models.

−

3.5.1.1 How Is Our Study Related?

Our study is different from [104, 105, 106, 43] in following two important aspects.

First, the uncertain stream model proposed in this topic is the substantially differ-

ent from the ones proposed before. In the probabilistic sequence model proposed

in [43], each element in the stream is a random variable (distribution). While we

model an uncertain stream as a series of sample instances generated by a temporal

random variable. The set of random variables (i.e., uncertain objects) are fixed. The

distributions of those random variables evolve over time. Our model handles some

application scenarios that are not covered by the models in [104, 105, 106, 43].

Second, we focus on continuous probabilistic threshold top- k queries on slid-

ing windows, a novel type of queries on uncertain data streams that have not been

addressed before. [104, 105, 106] deal with aggregates on a whole stream. The op-

erators discussed in [43] cannot be directly used to answer continuous probabilistic

threshold top- k queries.

3.5.2 Continuous Ranking and Quantile Queries on Data Streams

A rank or quantile query is to find a data entry with a given rank against a monotonic

order specified on the data. Rank queries have several equivalent variations [107,

108, 109] and play important roles in many data stream applications.

It has been shown in [105] that an exact computation of rank queries requires

memory size linear to the size of a data set by any one-scan technique, which may

be impractical in on-line data stream computation where streams are massive in size

and fast in arrival speed. Approximately computing rank queries over data streams

has been investigated in the form of quantile computation.

Ranking Queries on Uncertain Data

Search WWH ::

Custom Search

Home