Quantifying the Privacy of Exact Hiding Algorithms - Association Rule Hiding for Data Mining


Fig. 18.1: A layered approach to quantifying the privacy that is offered by the exact hiding algorithms.

Figure 18.1(i) demonstrates the layered approach of [27] as applied to a sanitized database D. The support axis (shown vertically in the figure) is partitioned into two regions with respect to the minimum support threshold msup that is used for mining the frequent itemsets in D. In the upper region (above msup), Layer 0 contains all the frequent itemsets that are found in D by a frequent itemset mining algorithm such as Apriori [7]. The value MSF denotes the maximum support of a frequent itemset in D. The region starting just below msup contains all the infrequent itemsets, including the sensitive ones, provided that the applied hiding algorithm concealed them properly. The region below msup is further partitioned into three layers, defined as follows:

Layer 1: This layer spans from the maximum support of an infrequent itemset (MSI) down to the maximum support of a sensitive itemset (MSS), excluding the itemsets at MSS. It models the “gap” that may exist below the borderline, either due to the use of a margin of safety to better protect the sensitive knowledge (as is typical in various hiding approaches, e.g. [63]), or due to the properties of the original database D_O and the sensitive itemsets that were selected to be hidden. This layer is assumed to contain y itemsets.

Layer 2: This layer spans from the maximum support of a sensitive itemset (MSS) down to the minimum support of a sensitive itemset (mSS), inclusive. It contains all the sensitive knowledge that the owner wishes to protect, possibly along with some nonsensitive infrequent itemsets. This layer is assumed to contain s itemsets, out of which S are the sensitive ones.
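
The layer boundaries described so far can be illustrated with a minimal Python sketch. It is not the procedure of [27]. The function name assign_layers, the toy support counts, and the convention that an itemset is frequent when its support is at least msup are all assumptions made for illustration. Itemsets whose support falls below mSS are left unlabeled, since that region is not defined in the text above.

```python
def assign_layers(supports, sensitive, msup):
    """Map each itemset to a layer index (hypothetical helper).

    supports  -- dict: itemset -> support count in the sanitized database D
    sensitive -- set of itemsets that the owner wanted hidden
    msup      -- minimum support threshold (assumed: frequent iff support >= msup)
    """
    sens_supports = [supports[i] for i in sensitive]
    if not sens_supports:
        raise ValueError("no sensitive itemsets given")
    if max(sens_supports) >= msup:
        raise ValueError("a sensitive itemset is still frequent; hiding failed")

    MSS = max(sens_supports)  # maximum support of a sensitive itemset
    mSS = min(sens_supports)  # minimum support of a sensitive itemset

    layers = {}
    for itemset, sup in supports.items():
        if sup >= msup:
            layers[itemset] = 0      # Layer 0: frequent itemsets
        elif sup > MSS:
            layers[itemset] = 1      # Layer 1: from MSI down to MSS, exclusive
        elif sup >= mSS:
            layers[itemset] = 2      # Layer 2: from MSS down to mSS, inclusive
        else:
            layers[itemset] = None   # below mSS: not covered by the text above
    return layers


# Hypothetical toy data: support counts in a sanitized database D.
supports = {"a": 10, "b": 8, "ab": 6, "c": 5, "ac": 5, "bc": 4, "abc": 2}
sensitive = {"ac", "bc"}
msup = 7

layers = assign_layers(supports, sensitive, msup)

MSF = max(v for v in supports.values() if v >= msup)  # max support, frequent
MSI = max(v for v in supports.values() if v < msup)   # max support, infrequent
y = sum(1 for lay in layers.values() if lay == 1)     # itemsets in Layer 1
s = sum(1 for lay in layers.values() if lay == 2)     # itemsets in Layer 2
S = sum(1 for i in sensitive if layers[i] == 2)       # sensitive ones in Layer 2
```

In this toy setting, Layer 1 holds the single itemset "ab" (y = 1), while Layer 2 holds "c", "ac" and "bc" (s = 3), of which "ac" and "bc" are sensitive (S = 2). The nonsensitive itemset "c" shows how Layer 2 can contain infrequent itemsets that the owner never marked as sensitive.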
