Database Reference
In-Depth Information
rule−tuple compression
ti
tR
ti
generation rule R
generation rule−tuple, Pr(tR)=Pr(R)
Case 2: ti is ranked lower than all tuples in R
ti
rule−tuple compression
tR_left
ti
generation rule R
generation rule R
Case 3 (Subcase 1): ti is ranked between tuples in R and ti is not in R
Fig. 5.1
Computing
Pr
k
(
t
i
)
for one tuple
t
i
.
Corollary 5.2 (Tuple in generation rule).
For a tuple t
∈
R such that
|
R
| >
1
,
Pr
k
Q
,
T
(
Pr
k
Q
,
T
(
where uncertain table T
=(
t
|
t
∈
.
Proof.
Since the tuples in
R
are mutually exclusive, the probability that one tuple in
R
appears is the sum of the membership probability of each tuple in
R
. Therefore,
the corollary holds.
t
)=
t
)
S
t
i
−{
R
}
)
∪{
t
}
For a tuple
t
and its dominant set
S
t
, we can check
t
against the multi-tuple
generation rules one by one. Each multi-tuple generation rule can be handled by
one of the above two cases as illustrated in Figure 5.1, and the dependent tuples
in
S
t
can be either compressed into some generation rule-tuples or removed due to
the involvement in the same generation rule as
t
. After the generation rule-tuple
compression, the resulting set is called the
compressed dominant set
of
t
, denoted
by
T
(
t
)
. Based on the above discussion, for a tuple
t
∈
T
, all tuples in
T
(
t
)
∪{
t
}
are independent,
Pr
k
Q
,
T
(
Pr
k
Q
,
T
(
t
)
∪{
t
}
(
t
)=
t
)
. We can apply Theorem 5.2 to calculate
Pr
k
(
t
)
by scanning
T
(
t
)
once.
Example 5.1 (Generation rule-tuple compression). Consider a list of tuples t
1
,···,
t
11
in the ranking order. Suppose we have two multi-tuple generation rules: R
1
=
t
2
⊕
t
7
. Let us consider how to compute Pr
3
and Pr
3
t
4
⊕
t
9
and R
2
=
t
5
⊕
(
t
6
)
(
t
7
)
.
Tuple t
6
is ranked between tuples in R
1
, but t
6
∈
R
1
. The first subcase of Case 3
should be applied. Thus, we compress R
1
left
=
{
t
2
,
t
4
}
into an generation rule-
(
)=
(
)+
(
)
tuple t
2
,
4
with membership probability Pr
t
2
,
4
Pr
t
2
Pr
t
4
. Similarly, t
6
is
also ranked between tuples in R
2
and t
6
∈
R
2
, but R
2
left
=
{
}
t
5
. The compression
does not remove any tuple. After the compression, T
(
t
6
)=
{
t
1
,
t
2
,
4
,
t
3
,
t
5
}
. Since the
are independent, we can apply Theorem 5.2 to compute Pr
3
tuples in T
(
t
6
)
∪{
t
6
}
(
t
6
)
using T
.
Since t
7
∈
(
t
6
)
R
2
, the tuples in R
2
except for t
7
itself should be removed. Thus, we
have T
(
t
7
)=
{
t
1
,
t
2
,
4
,
t
3
,
t
6
}
.
We can sort all tuples in
T
into a sorted list
L
in the ranking order. For each tuple
t
i
, by one scan of the tuples in
L
before
t
i
, we obtain the compressed dominant set