Database Reference
In-Depth Information
To address Challenge 3, we develop three categories of efficient query answering
methods for each of the proposed ranking queries on uncertain data.
First, we integrate statistical principles and scalable computational techniques to
compute the exact answers to queries. Second, we develop efficient randomized
algorithms to estimate the answers to ranking queries. Third, we devise efficient
approximation methods based on the distribution characteristics of answers to
ranking queries.
A software package including the implementation of our algorithms and the prob-
abilistic data generator was released to public in March, 2008, and is downloadable
at http://www.cs.sfu.ca/ ˜ jpei/Software/PTKLib.rar .
1.4 Organization of the Topic
The remainder of the topic is organized as follows.
In Chapter 2, we formulate the uncertain data models and probabilistic ranking
queries that will be studied in this topic.
Chapter 3 reviews the related literature on uncertain data processing and princi-
ples of statistics and probability theory that will be used.
In Chapter 4, we introduce the top- k typicality queries on uncertain data, which
find the top- k most typical instances for an uncertain object. We answer two
fundamental questions. First, given an uncertain object with a large number of
instances, how can we model the typicality of each instance? Second, how to
efficiently and effectively select the most representative instances for the object?
This is essentially the problem of ranking instances within an uncertain object.
In Chapter 5, we discuss probabilistic ranking queries on probabilistic databases,
which select the instances in different uncertain objects whose probabilities of
being ranked top- k are high. Although it is an extension of the problem in Chap-
ter 4, the query evaluation techniques are significantly different.
We extend probabilistic ranking queries from static data to dynamic data in Chap-
ter 6. The objective is to continuously report the answers to a probabilistic rank-
ing query on a set of uncertain data streams, as illustrated in the second ap-
plication scenario in Example 1.1. We develop stream-specific query evaluation
methods that are highly space efficient.
In Chapter 7, we introduce inter-object dependencies among uncertain data ob-
ject and study probabilistic ranking queries on a set of dependent uncertain ob-
jects, which can find important applications in data integration. We show that the
model is a special case of Markov Random Fields. Moreover, we develop effi-
cient methods to evaluate ranking queries on the proposed uncertain data model.
In Chapter 8, we extend the probabilistic ranking queries to uncertain road net-
works, where the weights of each edge in the network is an uncertain object. We
want to select the paths having high confidences of being ranked top- k in terms
Search WWH ::




Custom Search