Database Reference
In-Depth Information
ties arbitrarily). The rationale here is as follows; by identifying the sensitive itemset
with the highest support, one can safely decide upon the minimum number of trans-
actions that must not support this itemset in D X , so that it becomes infrequent in D.
This number, theoretically, is sufficient to allow the hiding of all the other itemsets
participating in S and all its supersets, and corresponds to the minimum number of
transactions that D X must have to properly secure the sensitive knowledge. Theo-
rem 16.1 demonstrates how this lower bound Q is established [26].
Theorem 16.1. Let I M 2S such that for all I 2S it holds that sup(I M ;D O )
sup(I;D O ). Then, the minimum size of D X to allow the hiding of the sensitive item-
sets from S in D equals:
j sup(I M ;D O )
mfreq
k
Q =
N
+1
(16.1)
Proof. We only need to prove that any itemset I 2S will become hidden in D if and
only if Q > sup(I;D O )=mfreqN, provided that I is not supported in D X . Since
itemset I must be infrequent in database D, the following condition holds:
sup(I;D) < msup =) sup(I;D O )+sup(I;D X ) < mfreqP
Moreover, since sup(I;D X ) 0 and (by construction) P = N +Q, we have that
sup(I;D O ) < mfreq(N +Q) =) sup(I;D O ) < N +Q
The last inequality was relaxed by removing term sup(I;D X ). Since it consists of
the summation of non-negative terms and a non-negative term was removed, the
inequality will continue to hold. Its holding proves the holding of (16.1) since the
sensitive itemset with the highest support will require the largest amount of trans-
actions (not supporting it) in the extension D X in order to be properly hidden. The
lower bound of the number of the necessary transactions for D X will thus equal
the floor value of sup(I M ;D O )=mfreqN plus one. Moreover, as expected, itemsets
having lower support than I M may be supported by some transactions of database
D X , as long as they are infrequent in D.
Equation (16.1) provides the absolute minimum size of D X to accommodate
for the sensitive knowledge hiding. However, as presented in [26] and discussed in
the following, this lower bound may, under certain circumstances, be insufficient
to allow for the identification of an exact hiding solution, even if such a solution
exists. This situation may occur if, for instance, the number of transactions returned
by (16.1) is too small to allow for consistency among the different requirements
imposed upon the status (frequent vs. infrequent) of the various itemsets appearing
in D. Later on, we present a way to overcome this limitation.
Search WWH ::




Custom Search