Time      # Samp  Speed  Volume  Occupancy
00:01:51  30      47     575     6
00:16:51  30      48     503     5
00:31:51  30      48     503     5
00:46:51  30      49     421     4
01:01:52  30      48     274     5
01:16:52  30      42     275     14
...
Table 1.2 Data for segment SEGK 715001 for 07/15/2001 in ARTIMIS Data Archives
(Number of Lanes: 4).
Example 1.1 demonstrates the need for ranking queries in uncertain data analysis.
In traditional analysis of deterministic data, ranking queries play an important
role by selecting the subset of records of interest according to user-specified
criteria. With the rapidly increasing amount of uncertain data, ranking queries
have become even more important, since uncertainty not only increases the scale
of the data but also makes the data harder to understand and analyze.
1.2 Challenges
Although useful in many important applications, ranking queries on uncertain
data pose significant challenges for query semantics and processing.
Challenge 1 What are the uncertain data models that we need to adopt?
Example 1.1 illustrates three different application scenarios in ranking the
information obtained from traffic sensors. This not only shows the usefulness of
ranking queries on uncertain data, but also raises a fundamental question: how can
we develop uncertain data models that capture the characteristics of the data and
suit application needs?
In particular, we need to consider the following three aspects. First, is the
uncertain data static or dynamic? Second, how should we describe the dependencies
among uncertain data objects? Third, how can we handle complex uncertain data such
as a graph?
Challenge 2 How to formulate probabilistic ranking queries?
As shown in Example 1.1, different ranking queries on uncertain data can be asked
according to different application needs. In Scenario 1, we want to select the
records ranked in the top-k with high confidence, while in Scenario 2, the objective
is to find the sensors whose records are ranked in the top-k with probabilities no
smaller than a threshold in a time window. Last, in Scenario 3, we are interested
in finding paths whose total (uncertain) travel time is ranked at the top.
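To make the threshold-style query of Scenario 2 concrete, the following is a minimal sketch in Python. It rests on assumptions that the text above does not state: each record carries a score and a membership probability, records exist independently of one another, and scores are distinct. The names Record, topk_probabilities and threshold_topk, as well as the sample values, are hypothetical and serve only as an illustration. Under these assumptions, a record is in the top-k when it exists and fewer than k higher-scored records exist.

```python
from typing import Dict, List, Tuple

# (record id, score, membership probability); hypothetical layout, not from the text
Record = Tuple[str, float, float]


def topk_probabilities(records: List[Record], k: int) -> Dict[str, float]:
    """Probability that each record is among the k highest-scored existing records.

    Assumption: records exist independently and have distinct scores.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    ordered = sorted(records, key=lambda r: r[1], reverse=True)
    # fewer[j] = probability that exactly j higher-scored records exist (only j < k kept)
    fewer = [1.0] + [0.0] * (k - 1)
    result: Dict[str, float] = {}
    for rec_id, _score, prob in ordered:
        # In the top-k if the record exists and fewer than k better records exist.
        result[rec_id] = prob * sum(fewer)
        # Fold this record into the count distribution for the records after it.
        updated = [0.0] * k
        for j in range(k):
            updated[j] = fewer[j] * (1.0 - prob)
            if j > 0:
                updated[j] += fewer[j - 1] * prob
        fewer = updated
    return result


def threshold_topk(records: List[Record], k: int, threshold: float) -> List[str]:
    """Ids of records whose top-k probability is at least the threshold."""
    probs = topk_probabilities(records, k)
    return sorted(rec_id for rec_id, p in probs.items() if p >= threshold)


if __name__ == "__main__":
    # Hypothetical sensor records: (id, score, membership probability)
    sample = [("s1", 50.0, 0.5), ("s2", 40.0, 1.0), ("s3", 30.0, 0.8)]
    print(topk_probabilities(sample, k=2))
    print(threshold_topk(sample, k=2, threshold=0.5))
```

With the sample values and k = 2, the top-2 probabilities are 0.5 for s1, 1.0 for s2 and 0.4 for s3, so a threshold of 0.5 selects s1 and s2. If records are correlated, which is one of the dependency questions raised under Challenge 1, this independence-based computation no longer applies as written.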
Therefore, it is important to develop meaningful ranking queries according to
different application interests. Moreover, the probability associated with each data
 