Background - Association Rule Hiding for Data Mining

Database Reference

In-Depth Information

Fig. 2.2: Examples of two lattices for a database with (i) I = fa; b; cg, and (ii)

I =fa; b; c; dg. In the first lattice we also demonstrate the positive and the negative

borders when the database D is the same as in Fig. 2.1 and msup=4.

J 2F D g. Finally, the border of F D , denoted as Bd(F D ), is the union of these two

sets: Bd(F D ) =Bd + (F D )[Bd (F D ) 3 . For example, assuming that msup= 4 in

the database of Fig. 2.1, we have that Bd + (F D ) =fab; acg, Bd (F D ) =fbcg and

Bd(F D ) = fab; ac; bcg. These borders are presented in Fig. 2.2(i). Borders allow

for a condense representation of the itemsets' lattice, identifying the key itemsets

which separate all frequent patterns from their infrequent counterparts. More details

on the theory of borders as well as its underlying concepts can be found in the work

of Mannila & Toivonen [46].

2.2 Problem Formulation and Statement

Having presented the necessary background for association rule mining, in this sec-

tion we formally set out the problem of association rule hiding. First, in Section

2.2.1, we highlight the goals of association rule hiding algorithms, rank these goals

in terms of importance of being satisfied, as well as discuss the side-effects that are

introduced when each of them is left unsatisfied in the sanitized database. Following

that, in Section 2.2.2, we deliver the formal problem statement.

3 For notational convenience, we will use Bd + and Bd to refer to the positive and the negative

border of a set of itemsets, respectively, when the set of frequent itemsets is obvious in our context.

Search WWH ::

Custom Search

Home