Hybrid Algorithm - Association Rule Hiding for Data Mining

Database Reference

In-Depth Information

capture transactions in D. Since J I, it holds that D I D J . Then, provided that

A = å T q 2(D J nD I ) Õ i m 2J u qm the following must hold for database D:

A+sup(J;D O )sup(I;D O ) 0

(16.6)

From (16.5) and (16.6) we have that:

q=1

i m 2I u qm msupsup(J;D O )

A+

and therefore

q=1

i m 2J u qm msupsup(J;D O )

Theorem 16.3. If F D is the set of all frequent itemsets in D, then

F D =[ I2Bd + (F D ) Ã(I).

Proof. Due to the definition of the positive border it holds that Bd + (F D ) F D .

Based on the downward closure property of the frequent itemsets, all subsets of

a frequent itemset are also frequent. Therefore, all subsets of Bd + (F D ) must be

frequent, which means that [ I2Bd + (F D ) Ã(I) F D . To prove the theorem, we

need to show that F D [ I2Bd + (F D ) Ã(I) or (equivalently) that if I 2F D , then

I 2[ K2Bd + (F D ) Ã(K) )9J 2Bd + (F D ) : I J. Assume that this condition does

not hold. Then, it must hold that I 2F D and that @J 2Bd + (F D ) : I J. From the

last two conditions we conclude that I 2Bd + (F D ) since, if I 2Bd + (F D ) then the

second condition does not hold for J = I. Therefore, and due to the definition of the

positive border, we have that 9J 1 2F D : I J 1 . Thus, jJ 1 j>jIj)jJ 1 jjIj+1 )

jJ 1 j 1. Moreover, since @J 2Bd + (F D ) : I J it means that J 1 2Bd + (F D ), and

since J 1 2F D , 9J 2 2F D : J 1 J 2 , therefore jJ 2 j>jJ 1 j)jJ 2 jjJ 1 j+ 1. Finally,

since jJ 1 j 1 we conclude that jJ 2 j 2.

As it can be noticed, using induction it is easy to prove that 8k 2À :9J k 2F D :

jJ k j k. This means that there is no upper bound in the size of frequent itemsets,

a conclusion that is obviously wrong. Therefore, the initial assumption that I 2F D

and @J 2Bd + (F D ) : I J does not hold which means that F D [ I2Bd + (F D ) Ã(I).

Theorems 16.2 and 16.3 prove the cover relations that exist between the itemsets

of Bd + (F D ) and those of F 0 D . In the same manner one can prove that the itemsets

of Bd (F D ) are generalized covers for all the itemsets of Pn(Bd + (F 0 D )[f?g).

Therefore, the itemsets of the positive and the negative borders cover all the itemsets

in P. As a result, the following Corollary to Theorems 16.2 and 16.3 holds.

Corollary 16.1. (Optimal solution set C) The exact hiding solution, which is iden-

tical to the solution of the entire system of the 2 M 1 inequalities, can be attained

based on the itemsets of set

C =Bd + (F 0 D )[Bd (F 0 D )

(16.7)

Search WWH ::

Custom Search

Home