Database Reference
In-Depth Information
2.2.2 Ranking Uncertain Instances in Multiple Uncertain Objects
Given multiple uncertain objects, how can we select a small subset of instances
meeting users' interests? Ranking queries (also known as top-
k
queries) [3, 4, 5, 6]
are a class of important queries in data analysis that allows us to select the instances
ranked top according to certain user specified scoring functions. We consider the
top-
k
selection query model [29].
Definition 2.10 (Top-
k
selection queries).
For a set of instances
S
, each instance
o
S
is associated with a set of attributes
A
. Given a predicate
P
on
A
, a ranking
function
f
:
S
∈
0, a
top-
k
selection query
Q
P
,
f
returns a set
→
R
and a integer
k
>
of instances
Q
P
,
f
(
Q
P
,
f
(
S
)
⊆
S
P
, where
S
P
is the set of instances satisfying
P
,
|
S
)
|
=
o
)
Q
P
,
f
(
and
o
∈
Q
P
,
f
(
min
{
k
,|
S
P
|}
and
f
(
o
)
>
f
(
for any instances
o
∈
S
)
S
P
−
S
)
.
To keep our presentation simple, we assume that the top-
k
selection queries in our
discussion select all instances in question. That is
S
P
=
S
. Those selection predicates
can be implemented efficiently as filters before our ranking algorithms are applied.
Moreover, we assume that the ranking function
f
in a top-
k
selection query can be
efficiently applied to an instance
o
to generate a score
f
(
o
)
. When it is clear from
context, we also write
Q
P
,
f
as
Q
k
for the sake of simplicity.
2.2.2.1 Ranking Probabilities
How can we apply a top-
k
selection query to a set of uncertain objects? Since each
object appears as a set of instances, we have to rank the instances in the possible
worlds semantics.
A top-
k
selection query can be directly applied to a possible world that consists
of a set of instances. In a possible word, a top-
k
selection query returns
k
instances.
We define the rank-
k
probability and top-
k
probability for instances and objects as
follows.
Given a set of uncertain objects and a
ranking function f
, all instances of the
uncertain objects can be ranked according to the ranking function. For instances
o
1
and
o
2
,
o
1
f
o
2
if
o
1
is ranked higher than or equal to
o
2
according to
f
. The
ranking order
f
is a total order on all instances.
Definition 2.11 (rank-
k
Probability and top-
k
probability).
For an instance
o
, the
rank-
k
probability
Pr
is the probability that
o
is ranked at the
k
-th position in
possible worlds according to
f
, that is
(
o
,
k
)
∑
W
∈W
s.t.
o
=
W
f
(
k
)
Pr
(
o
,
k
)=
Pr
(
W
)
(2.1)
where
W
f
(
denotes the instance ranked at the
k
-th position in
W
.
The
top-
k
probability
Pr
k
k
)
(
o
)
is the probability that
o
is ranked top-
k
in possible
worlds according to
f
, that is,