BBA Algorithm - Association Rule Hiding for Data Mining

Database Reference

In-Depth Information

Chapter 10

BBA Algorithm

Sun & Yu [66, 67] in 2005 proposed the first frequent itemset hiding methodology

that relies on the notion of the border [46] of the nonsensitive frequent itemsets to

track the impact of altering transactions in the original database. By evaluating the

impact of each candidate item modification to the itemsets of the revised positive

border, the algorithm greedily selects to apply those modifications (item deletions)

that cause the least impact to the border itemsets. As already covered in the previous

chapter, the border itemsets implicitly dictate the status (i.e., frequent vs. infrequent)

of every itemset in the database. Consequently, the quality of the borders directly

affects the quality of the sanitized database that is produced by the hiding algorithm.

The heuristic strategy that was proposed in [66, 67], assigns a weight to each

itemset of the revised positive border (which is the original positive border after it

has been shaped up with the removal of the sensitive itemsets) in an attempt to quan-

tify its vulnerability of being affected by an item deletion. The assigned weights are

dynamically computed during the sanitization process as a function of the current

support of the corresponding itemsets in the database. To hide a sensitive itemset,

the algorithm calculates the expected impact of each candidate item deletion to the

itemsets of the revised positive border, by computing the sum of the weights of

the revised positive border itemsets that will be affected. Then, the algorithm de-

termines the optimal deletion candidate item, which is the item whose deletion has

minimal impact on the revised positive border, and deletes this item from a set of

carefully selected transactions. The proposed strategy aims to minimize the number

of nonsensitive frequent itemsets that are affected from the hiding of the sensitive

knowledge, as well as it attempts to maintain the relative support of the nonsensitive

frequent itemsets in the sanitized database.

The rest of this chapter is organized as follows. In Section 10.1, we explicitly

state the objectives that drive the hiding process of the BBA algorithm. Following

that, Section 10.2 provides the strategy that is employed for the hiding of a sensitive

itemset in a way that bears minimal impact on the revised positive border, as well

as it presents the order in which the sensitive itemsets are selected to be hidden.

Finally, Section 10.3 delivers the pseudocode of the algorithm.

Search WWH ::

Custom Search

Home