Database Reference
In-Depth Information
Chapter 2
Probabilistic Ranking Queries on Uncertain
Data
In this chapter, we formulate the probabilistic ranking queries on uncertain data.
We first introduce two basic uncertain data models and basic probabilistic ranking
queries. Then, we discuss three extended uncertain data models that suit different
application scenarios. Ranking queries on the extended uncertain data models are
also developed.
Frequently used definitions and notations are listed in Table 2.1.
2.1 Basic Uncertain Data Models
We consider uncertain data in the possible worlds semantics model [23, 12, 24, 7],
which has been extensively adopted by the recent studies on uncertain data pro-
cessing, such as [17, 8, 21]. Technically, uncertain data can be represented in two
ways.
2.1.1 Uncertain Object Model
An uncertain object O [18, 19, 20, 21] is conceptually governed by an underlying
random variable X . Theoretically, if X is a continuous random variable, the distri-
bution of X can be captured by a probability density function ( PDF for short); if X
is a discrete random variable, its distribution can be described by a probability mass
function ( PMF for short). In practice, the PDF or PMF of a random variable is often
unavailable. Instead, a sample set of instances x 1 ,···,
x m are used to approximate
the distribution of X , where each instance takes a membership probability. For an
instance x i
m ), the membership probability of x i measures the likeli-
hood that x i will occur. Due to the unavailability of X 's PDF or PMF, in this topic,
we represent an uncertain object O using the set of samples x 1 ,···,
X (1
i
x m generated by
the underlying random variable.
9
Search WWH ::




Custom Search