Database Reference
In-Depth Information
n
i
1 X i is the total number of successes in n independent Bernoulli
trials. X follows a binomial distribution for identical Bernoulli trials, and a Poisson-
binomial distribution for Poisson trials [103]. The exact distribution of Pr
The sum X
= ∑
=
(
X
=
i
)
can be calculated recursively using the Poisson binomial recurrence [103].
Given an uncertain table, a generation rule can be viewed as a Bernoulli trial, if
we consider the appearance of a tuple as a “success”. The probability of a rule is
the success probability of the corresponding trial. The probability of a tuple t to be
ranked top- k is the probability that t appears and there are fewer than k successes
appear before t .
3.4.2.1 How Is Our Study Related?
Some results of Poisson trials can be used in answering probability threshold top- k
queries. However, the study of Poisson trials in probability theory does not address
the concerns on efficient query answering for large databases. Moreover, multi-tuple
generation rules pose new challenges.
In our study, we develop several techniques to process generation rules effi-
ciently. Pruning rules are also proposed to achieve early stop without scanning the
whole table, which significantly improves the efficiency in query answering.
3.5 Uncertain Streams
In this section, we review the existing work highly related to ranking query moni-
toring on uncertain data streams and point out the differences.
3.5.1 Continuous Queries on Probabilistic Streams
To the best of our knowledge, [104, 105, 106, 43] are the only existing studies on
continuous queries on probabilistic data streams, which are highly related to our
study.
In [104], a probabilistic data stream is a (potentially infinite) sequence of un-
certain tuples, where each tuple is associated with a membership probability p
(
, meaning that the tuple takes a probability p to appear in an instance
(i.e., a possible world) of the probabilistic stream. It is assumed that tuples are in-
dependent from each other. Conventional stream sketching methods are extended
to such probabilistic data streams to approximate answers to complex aggregate
queries.
Jayram et al. [105, 106] adopted a different probabilistic data stream model. A
probabilistic data stream contains a (potentially infinite) sequence of uncertain ob-
jects, where each uncertain object is represented by a set of instances and each
0
<
p
1
)
Search WWH ::




Custom Search