Database Reference
In-Depth Information
Step 3: Evaluating If, before prob-interval b i is scanned, B has l tuples such that
l <
l , but after b i is scanned, B has l or more tuples, we enter the evaluating step.
If B has exactly l tuples, we do not need to do anything. However, if B has more
than l tuples, we need to evaluate the exact top- k probabilities of those tuples added
to B from b i , and include into B only the top-
l )
(
l
tuples of the largest top- k
probabilities.
When a tuple can be added into the answer set without the evaluating step, the
time to retrieve the tuple is constant. There are at most l such tuples. Only the top- k
probabilities of the tuples in the last prob-interval need to be evaluated. Let d be
the number of such tuples. Then, the time complexity of evaluating those tuples is
O
(
kmd
)
, as computing the top- k probabilities of those tuples requires O
(
kmd
)
time.
The overall time complexity of query answering is O
(
l
+
kmd
)
.
5.5.2.3 Answering Top-
(
p
,
l
)
Queries
Example 5.8 (Answering top-
queries). Consider the uncertain tuples in Fig-
ure 5.3 again. How can we answer the top-
(
p
,
l
)
(
p
,
l
)
query with p
=
0
.
5 and l
=
2 ?
We only need to check prob-interval b 3 =(
0
.
4
,
0
.
6
]
that contains p. First, the
top- 2 highest ranks in b 3 .
U 3 are 3 for tuples t 3 and t 4 , which means MR 0 . 5 (
t 3 )
3
and MR 0 . 5 (
t 4 )
3 . Therefore, we can set 3 as the upper bound of the 0
.
5 -ranks, and
any tuple t with MR 0 . 5 >
3 cannot be the answer to the query.
Second, we scan list b 3 .
L 3 from the beginning. t 1 is added into a buffer B, since
t 1 .
3 and MR 0 . 5 might be smaller than or equal to 3 .t 3 and t 4 are added into
B due to the same reason. Then, when scanning t 2 , we find t 2 .
L 3 <
L 3 =
4 , which means
MR 0 . 5 (
4 . Thus, t 2 and any tuples following t 2 in the list (in this
example, no other tuples) can be pruned.
Last, the 0
t 2 )
MR 0 . 4 (
t 2 )
5 -ranks of the tuples in B (i.e., t 1 ,t 3 , and t 4 ) are calculated, and the
tuples with the top- 2 highest 0
.
.
5 -ranks are returned as the answer to the query,
which are t 1 and t 3 .
query, the three steps work as follows.
Step 1: Bounding The p -rank of a tuple can be bounded using the following rule.
To answer a top-
(
p
,
l
)
Theorem 5.12 (Bounding p -ranks). Let T be a set of uncertain tuples indexed by
PRist with granularity parameter h. For a tuple t
S and p
(
0
,
1
]
, let b i ( 1
i
h)
i
1
h <
i
h . Then, t
be the prob-interval such that p
b i , i.e.,
p
.
L i
MR p (
t
)
t
.
U i .
Proof. To prove Theorem 5.12, we need the following two lemmas.
Lemma 5.2 (Monotonicity). Let t be an uncertain tuple, and p 1 and p 2 be two real
values in
(
0
,
1
]
.Ifp 1
p 2 , then MR p 1 (
t
)
MR p 2 (
t
)
.
k 1 . Then, Pr k 1
k 1 , Pr k
Proof. Let MR p 1 (
t
)=
(
t
)
p 1 . Moreover, for any k
<
(
t
) <
p 1 .
k 1 , Pr k
Since p 1
p 2 , for any k
<
(
t
) <
p 2 . Therefore, MR p 2 (
t
)
k 1 =
MR p 1 (
t
)
.
Lemma 5.3. Let t be an uncertain tuple, k be a positive integer, and p be a real
values in
.IfPr k
(
0
,
1
]
(
t
)
p, then MR p (
t
)
k.
Proof. Since Pr k
k , Pr x
(
t
)
p , for any x
(
t
)
p . Thus, MR p (
t
)
k .
 
Search WWH ::




Custom Search