Related Work - Ranking Queries on Uncertain Data

Database Reference

In-Depth Information

n

i

1 X i is the total number of successes in n independent Bernoulli

trials. X follows a binomial distribution for identical Bernoulli trials, and a Poisson-

binomial distribution for Poisson trials [103]. The exact distribution of Pr

The sum X

= ∑

=

(

X

=

i

)

can be calculated recursively using the Poisson binomial recurrence [103].

Given an uncertain table, a generation rule can be viewed as a Bernoulli trial, if

we consider the appearance of a tuple as a “success”. The probability of a rule is

the success probability of the corresponding trial. The probability of a tuple t to be

ranked top- k is the probability that t appears and there are fewer than k successes

appear before t .

3.4.2.1 How Is Our Study Related?

Some results of Poisson trials can be used in answering probability threshold top- k

queries. However, the study of Poisson trials in probability theory does not address

the concerns on efficient query answering for large databases. Moreover, multi-tuple

generation rules pose new challenges.

In our study, we develop several techniques to process generation rules effi-

ciently. Pruning rules are also proposed to achieve early stop without scanning the

whole table, which significantly improves the efficiency in query answering.

3.5 Uncertain Streams

In this section, we review the existing work highly related to ranking query moni-

toring on uncertain data streams and point out the differences.

3.5.1 Continuous Queries on Probabilistic Streams

To the best of our knowledge, [104, 105, 106, 43] are the only existing studies on

continuous queries on probabilistic data streams, which are highly related to our

study.

In [104], a probabilistic data stream is a (potentially infinite) sequence of un-

certain tuples, where each tuple is associated with a membership probability p

(

, meaning that the tuple takes a probability p to appear in an instance

(i.e., a possible world) of the probabilistic stream. It is assumed that tuples are in-

dependent from each other. Conventional stream sketching methods are extended

to such probabilistic data streams to approximate answers to complex aggregate

queries.

Jayram et al. [105, 106] adopted a different probabilistic data stream model. A

probabilistic data stream contains a (potentially infinite) sequence of uncertain ob-

jects, where each uncertain object is represented by a set of instances and each

0

<

p

≤

1

)

Ranking Queries on Uncertain Data

Search WWH ::

Custom Search

Home