Database Reference
In-Depth Information
move items from transactions that support the sensitive rules. The first algorithm,
MinFIA, operates as follows. For every restrictive pattern it identifies the support-
ing transactions and the item having the smallest support in the pattern (called the
victim item). Then, by using a user-supplied disclosure threshold, it first sorts the
identified transactions in ascending order of degree of conflict and then selects the
number of transactions (among them) that need to be sanitized. Finally, from each
selected transaction the algorithm removes the victim item. The MaxFIA algorithm
proceeds exactly as the MinFIA with the only difference of selecting as the victim
item the one that has the maximum support in the sensitive association rule. Fi-
nally, IGA aims at clustering the restricted patterns into groups that share the same
itemsets. By identifying overlapping clusters, the algorithm hides the corresponding
sensitive patterns at once (based on the sensitive itemsets they share), thus reduces
the distortion that is induced to the database when hiding the sensitive knowledge.
A more efficient approach than that of [54] and the work of [20, 63, 64], was
introduced by Oliveira & Zaïane in [55]. The proposed algorithm, called SWA, is an
efficient, scalable, one-scan heuristic which aims at providing a balance between the
needs for privacy and knowledge discovery in association rule hiding. It achieves to
hide multiple rules in only one pass through the dataset, regardless of its size or the
number of sensitive rules that need to be protected. The algorithm proceeds in five
steps that are applied to every group of K transactions (thus formulating a window
of size K) read from the original database. Initially, the nonsensitive transactions
are separated from the sensitive ones and copied directly to the sanitized database.
For each sensitive rule, the item having the highest frequency is selected and its
supporting transactions are identified. Then, a disclosure threshold y, potentially
different for each sensitive rule, is used to capture the severity characterizing the
release of the rule. Based on this threshold, SWA computes the number of supporting
transactions that need to be sanitized for each rule and then sorts them in ascending
order of length. For each selected transaction, the item with the highest frequency
as identified before is removed and then the transaction is copied to the sanitized
database. SWA is experimentally shown to outperform the state-of-the-art heuristic
approaches in terms of concealing all the sensitive rules, while maintaining high
data utility of the released database.
Amiri [9] proposes three effective, multiple association rule hiding heuristics
that outperform SWA by offering higher data utility and lower distortion, at the
expense of increased computational speed. Although similar in philosophy to the
previous approaches, the three proposed methodologies do a better job in modeling
the overall objective of a rule hiding algorithm. The first approach, called Aggregate,
computes the union of the supporting transactions for all sensitive itemsets. Among
them, the transaction that supports the most sensitive and the least nonsensitive item-
sets is selected and expelled from the database. The same process is repeated until
all the sensitive itemsets are hidden. Similarly to this approach, the Disaggregate
approach aims at removing individual items from transactions, rather than remov-
ing the entire transaction. It achieves that by computing the union of all transactions
supporting sensitive itemsets and then, for each transaction and supporting item, by
calculating the number of sensitive and nonsensitive itemsets that will be affected if
Search WWH ::




Custom Search