Introduction - Ranking Queries on Uncertain Data

Database Reference

In-Depth Information

To address Challenge 3, we develop three categories of efficient query answering

methods for each of the proposed ranking queries on uncertain data.

First, we integrate statistical principles and scalable computational techniques to

compute the exact answers to queries. Second, we develop efficient randomized

algorithms to estimate the answers to ranking queries. Third, we devise efficient

approximation methods based on the distribution characteristics of answers to

ranking queries.

A software package including the implementation of our algorithms and the prob-

abilistic data generator was released to public in March, 2008, and is downloadable

at http://www.cs.sfu.ca/ ˜ jpei/Software/PTKLib.rar .

1.4 Organization of the Topic

The remainder of the topic is organized as follows.

•

In Chapter 2, we formulate the uncertain data models and probabilistic ranking

queries that will be studied in this topic.

•

Chapter 3 reviews the related literature on uncertain data processing and princi-

ples of statistics and probability theory that will be used.

•

In Chapter 4, we introduce the top- k typicality queries on uncertain data, which

find the top- k most typical instances for an uncertain object. We answer two

fundamental questions. First, given an uncertain object with a large number of

instances, how can we model the typicality of each instance? Second, how to

efficiently and effectively select the most representative instances for the object?

This is essentially the problem of ranking instances within an uncertain object.

•

In Chapter 5, we discuss probabilistic ranking queries on probabilistic databases,

which select the instances in different uncertain objects whose probabilities of

being ranked top- k are high. Although it is an extension of the problem in Chap-

ter 4, the query evaluation techniques are significantly different.

•

We extend probabilistic ranking queries from static data to dynamic data in Chap-

ter 6. The objective is to continuously report the answers to a probabilistic rank-

ing query on a set of uncertain data streams, as illustrated in the second ap-

plication scenario in Example 1.1. We develop stream-specific query evaluation

methods that are highly space efficient.

•

In Chapter 7, we introduce inter-object dependencies among uncertain data ob-

ject and study probabilistic ranking queries on a set of dependent uncertain ob-

jects, which can find important applications in data integration. We show that the

model is a special case of Markov Random Fields. Moreover, we develop effi-

cient methods to evaluate ranking queries on the proposed uncertain data model.

•

In Chapter 8, we extend the probabilistic ranking queries to uncertain road net-

works, where the weights of each edge in the network is an uncertain object. We

want to select the paths having high confidences of being ranked top- k in terms

Ranking Queries on Uncertain Data

Search WWH ::

Custom Search

Home