Introduction - Ranking Queries on Uncertain Data

record introduces a new dimension in ranking queries. How to leverage the probabilities in ranking queries remains challenging in uncertain data analysis.

Challenge 3: How to develop efficient and scalable query processing methods?

Evaluating ranking queries on uncertain data is challenging. On the one hand, traditional ranking query processing methods cannot be directly applied since they do not consider how to handle probabilities. On the other hand, although some standard statistical methods such as Bayesian statistics [26] can be applied to analyzing uncertain data in some applications, efficiency and scalability issues are usually not well addressed.

Meanwhile, as shown in Example 1.1, uncertain data is a summary of all possible worlds. Therefore, a naïve way to answer a ranking query on uncertain data is to apply the query to all possible worlds and summarize the answers to the query. However, it is often computationally prohibitive to enumerate all possible worlds. Thus, we need to develop efficient and scalable query evaluation methods for ranking queries on uncertain data.
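The naïve possible-worlds approach can be illustrated with a minimal sketch. It assumes a toy uncertain object model in which each object has a few mutually exclusive instances (score, probability) and objects are independent; the data, the function name `topk_probabilities`, and the choice of "probability of ranking in the top-k" as the summary are all hypothetical and are not taken from Example 1.1.

```python
from itertools import product

# Hypothetical toy data: each uncertain object is a list of
# (score, probability) instances; probabilities per object sum to 1.
objects = {
    "A": [(90, 0.5), (40, 0.5)],
    "B": [(70, 1.0)],
    "C": [(80, 0.25), (30, 0.75)],
}

def topk_probabilities(objects, k):
    """Probability that each object is ranked in the top-k,
    computed by enumerating every possible world."""
    names = list(objects)
    result = {name: 0.0 for name in names}
    # A possible world picks exactly one instance per object
    # (objects are assumed independent in this sketch).
    for world in product(*(objects[name] for name in names)):
        world_prob = 1.0
        for _score, prob in world:
            world_prob *= prob
        # Rank the objects in this world by score, highest first.
        ranked = sorted(zip(names, world),
                        key=lambda item: item[1][0], reverse=True)
        # Summarize: add the world's probability to every top-k object.
        for name, _instance in ranked[:k]:
            result[name] += world_prob
    return result

probs = topk_probabilities(objects, k=1)
print(probs)  # {'A': 0.5, 'B': 0.375, 'C': 0.125}
```

The number of possible worlds is the product of the instance counts of all objects, so this enumeration grows exponentially with the number of uncertain objects, which is why it is prohibitive beyond toy inputs.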

1.3 Focus of the Topic

In this topic, we discuss probabilistic ranking queries on uncertain data and address the three challenges in Section 1.2. Specifically, we focus on the following aspects.

• We introduce three extended uncertain data models.

To address Challenge 1, we first study two basic uncertain data models, the probabilistic database model and the uncertain object model, and show that the two models are equivalent.

Then, we develop three extended uncertain object models to address three important application scenarios. The first extension, the uncertain data stream model, describes uncertain objects whose distributions evolve over time. The second extension, the probabilistic linkage model, introduces inter-object dependencies into uncertain objects. The third extension, the uncertain road network model, models the weight of each edge in road networks as an uncertain object.

• We discuss five novel problems of ranking uncertain data.

To address Challenge 2, we formulate five novel ranking problems on uncertain data models from multiple aspects and levels.

First, from the data granularity point of view, we study the problems of ranking instances within a single uncertain object, ranking instances among multiple uncertain objects, ranking uncertain objects, and ranking the aggregates of a set of uncertain objects. Second, from the ranking scope point of view, we study ranking queries within an uncertain object and among multiple uncertain objects. Third, from the query type point of view, we discuss two categories of ranking queries considering both ranking criteria and the probability constraint.

• We discuss three categories of query processing methods.
