Database Reference
In-Depth Information
capture transactions in D. Since J I, it holds that D
I
D
J
. Then, provided that
A =
å
T
q
2(D
J
nD
I
)
Õ
i
m
2J
u
qm
the following must hold for database D:
A+sup(J;D
O
)sup(I;D
O
) 0
(16.6)
From (16.5) and (16.6) we have that:
Q
q=1
i
m
2I
u
qm
msupsup(J;D
O
)
A+
and therefore
Q
q=1
i
m
2J
u
qm
msupsup(J;D
O
)
Theorem 16.3. If F
D
is the set of all frequent itemsets in D, then
F
D
=[
I2Bd
+
(F
D
)
Ã(I).
Proof. Due to the definition of the positive border it holds that Bd
+
(F
D
) F
D
.
Based on the downward closure property of the frequent itemsets, all subsets of
a frequent itemset are also frequent. Therefore, all subsets of Bd
+
(F
D
) must be
frequent, which means that [
I2Bd
+
(F
D
)
Ã(I) F
D
. To prove the theorem, we
need to show that F
D
[
I2Bd
+
(F
D
)
Ã(I) or (equivalently) that if I 2F
D
, then
I 2[
K2Bd
+
(F
D
)
Ã(K) )9J 2Bd
+
(F
D
) : I J. Assume that this condition does
not hold. Then, it must hold that I 2F
D
and that @J 2Bd
+
(F
D
) : I J. From the
last two conditions we conclude that I 2Bd
+
(F
D
) since, if I 2Bd
+
(F
D
) then the
second condition does not hold for J = I. Therefore, and due to the definition of the
positive border, we have that 9J
1
2F
D
: I J
1
. Thus, jJ
1
j>jIj)jJ
1
jjIj+1 )
jJ
1
j 1. Moreover, since @J 2Bd
+
(F
D
) : I J it means that J
1
2Bd
+
(F
D
), and
since J
1
2F
D
, 9J
2
2F
D
: J
1
J
2
, therefore jJ
2
j>jJ
1
j)jJ
2
jjJ
1
j+ 1. Finally,
since jJ
1
j 1 we conclude that jJ
2
j 2.
As it can be noticed, using induction it is easy to prove that 8k 2À
:9J
k
2F
D
:
jJ
k
j k. This means that there is no upper bound in the size of frequent itemsets,
a conclusion that is obviously wrong. Therefore, the initial assumption that I 2F
D
and @J 2Bd
+
(F
D
) : I J does not hold which means that F
D
[
I2Bd
+
(F
D
)
Ã(I).
Theorems 16.2 and 16.3 prove the cover relations that exist between the itemsets
of Bd
+
(F
D
) and those of F
0
D
. In the same manner one can prove that the itemsets
of Bd
(F
D
) are generalized covers for all the itemsets of Pn(Bd
+
(F
0
D
)[f?g).
Therefore, the itemsets of the positive and the negative borders cover all the itemsets
in P. As a result, the following Corollary to Theorems 16.2 and 16.3 holds.
Corollary 16.1. (Optimal solution set C) The exact hiding solution, which is iden-
tical to the solution of the entire system of the 2
M
1 inequalities, can be attained
based on the itemsets of set
C =Bd
+
(F
0
D
)[Bd
(F
0
D
)
(16.7)