Database Reference
In-Depth Information
5.5.2.1 Answering PT-
k
Queries
The following example illustrates how to evaluate PT-
k
queries.
Example 5.6 (Answering PT-k queries). Consider the uncertain tuples indexed in
Figure 5.3 again, and a PT-k query with k
45
.
To find the tuples satisfying the query, we only need to look at the prob-interval
containing p
=
3
and p
=
0
.
=
0
.
45
, which is b
3
=(
0
.
4
,
0
.
6
]
. In the U-list of b
3
, we find that
3
, which means that Pr
3
6
and Pr
3
t
3
.
6
. There-
fore, t
3
and t
4
can be added into the answer set without calculating their exact top-k
probabilities. In the L-list of b
3
, we find that t
2
.
U
3
=
3
and t
4
.
U
3
=
(
t
3
)
>
0
.
(
t
4
)
>
0
.
4
, which means Pr
4
L
3
=
(
t
2
)
≤
0
.
4
.
Therefore, t
2
can be pruned.
Thus, only the top-
3
probability of t
1
needs to be calculated in order to further
verify if t
1
is an answer to the query. Since Pr
3
(
t
1
)=
0
.
5
, it can be added into the
answer set. The final answer is
{
t
1
,
t
3
,
t
4
}
.
The three steps for PT-
k
query evaluation are as follows.
Step 1: Bounding
We use Corollary 5.4 to determine whether the top-
k
probability
of
t
lies in
b
i
.
Corollary 5.4 (Bounding top-
k
probabilities).
Let T be a set of uncertain tuples
indexed by
PRist
with granularity parameter h. For a tuple t
∈
T and a positive
U
i
, then
i
−
h
<
integer k, if b
i
(
1
≤
i
≤
h) is the prob-interval such that t
.
L
i
<
k
<
t
.
Pr
k
i
h
.
Proof.
According to the definition of
PRist
,wehave
Pr
t
.
L
i
(
t
)
≤
i
−
1
and
Pr
t
.
L
i
+
1
(
t
)
≤
(
t
)
>
h
i
−
1
h
L
i
,
Pr
k
Pr
t
.
L
i
+
1
i
−
1
h
. On the other hand,
Pr
t
.
U
i
i
. Since
k
>
t
.
(
t
)
≥
(
t
)
>
(
t
)
>
h
and
Pr
t
.
U
i
−
1
i
U
i
,wehave
Pr
k
Pr
t
.
U
i
−
1
i
(
t
)
≤
h
. Since
k
<
t
.
(
t
)
≤
(
t
)
≤
h
.
Step2: Pruning and Validating
A tuple may be pruned or validated by checking
its lower rank
L
i
or upper rank
U
i
in the prob-interval containing the probability
threshold, as stated in Theorem 5.10.
Theorem 5.10 (Answering PT-
k
queries).
Let T be a set of uncertain tuples in-
dexed by
PRist
with granularity parameter h. For a tuple t
T and a PT-k query Q
k
f
∈
i
−
1
i
h
]
with probability threshold p
∈
b
i
=(
,
:
h
k, then Pr
k
1.
Pruning
:ift
.
L
i
>
(
t
)
<
p;
k, then Pr
k
p.
Proof.
According to the definition of
PRist
,wehave
Pr
t
.
L
i
2.
Validating
:ift
.
U
i
≤
(
t
)
>
i
−
1
and
Pr
t
.
L
i
+
1
(
t
)
≤
(
t
)
>
h
i
−
1
h
k
,
Pr
k
Pr
t
.
L
i
i
−
1
h
. Therefore, if
t
.
L
i
>
(
t
)
≤
(
t
)
≤
<
p
. Moreover, according to
the definition of
PRist
,
Pr
t
.
U
i
h
and
Pr
t
.
U
i
−
1
i
i
h
. Thus, if
t
(
)
>
(
)
≤
.
≤
t
t
U
i
k
, then
Pr
k
Pr
t
.
U
i
i
h
>
(
t
)
≥
(
t
)
>
p
.
Step 3: Evaluating
We only need to evaluate the exact top-
k
probabilities for those
tuples whose top-
k
probabilities falling into the prob-interval containing the proba-
bility threshold
p
, and which cannot be validated or pruned by Theorem 5.10.