Database Reference
In-Depth Information
Algorithm 5.3 The exact algorithm with reordering and pruning techniques
Input: an uncertain table T , a set of generation rules
, a top- k query Q k f , and a probability
R
threshold p
Output: Answer ( Q , p , T )
Method:
1: retrieve tuples in T in the ranking order one by one
2: for all t i T do
3:
compute T ( t i ) by rule-tuple compression and reordering
compute subset probability values and Pr k
( t i )
4:
if Pr k
( t i ) p then
6: output t i
7: end if
8: check whether t i can be used to prune future tuples
9: if all remaining tuples in T fail the probability threshold then
10: exit
11: end if
12: end for
5:
In summary, the exact algorithm for PT- k query answering is shown in Algo-
rithm 5.3. We analyze the complexity of the algorithm as follows.
For a multi-tuple rule R : t r 1 ,···,
t r m where t r 1 ,...,
t r m are in the ranking order, let
span
is processed, we need to remove rule-
tuple t r 1 ,..., r l 1 , and compute the subset probability values of the updated compressed
dominant sets. When the next tuple not involved in R is processed, t r 1 ,..., r l 1 and t r l
are combined. Thus, in the worst case, each multi-tuple rule causes the computation
of O
(
R
)=
r m
r 1 . When tuple t r l (
1
<
l
m
)
subset probability values. Moreover, in the worst case where each
tuple T passes the probability threshold, all tuples in T have to be read at least once.
The time complexity of the whole algorithm is O
(
2 k
·
span
(
R
))
(
))
.
As indicated by our experimental results, in practice the four pruning rules are ef-
fective. Often, only a very small portion of the tuples in T are retrieved and checked
before the exact answer to a PT- k query is obtained.
Interestingly, since PT- k query answering methods can be extended to evaluate
top-
(
kn
+
k
R ∈R span
R
queries (as discussed in Section 5.2.1), the pruning
techniques introduced in this section can be applied to answering top-
(
k
,
l
)
queries and top-
(
p
,
l
)
(
k
,
l
)
queries
and top-
(
p
,
l
)
queries as well.
5.3 A Sampling Method
One may trade off the accuracy of answers against the efficiency. In this section, we
present a simple yet effective sampling method for estimating top- k probabilities of
tuples.
For a tuple t , let X t be a random variable as an indicator to the event that t
is ranked top- k in possible worlds. X t =
1if t is ranked in the top- k list, and
X t =
0 otherwise. Apparently, the top- k probability of t is the expectation of X t ,
 
Search WWH ::




Custom Search