1.3 Research Challenges
The association rule hiding problem can be considered a variation of the well-established
database inference control problem [21] in statistical and multilevel
databases. The primary goal in database inference control is to block access to sensitive
information that can be obtained through nonsensitive data and inference rules.
In association rule hiding, we consider that it is not the data itself but rather the
sensitive association rules that breach privacy. Given a set of association
rules, which are mined from a specific data collection and are deemed
sensitive by an application specialist (e.g., the data owner), the task of association
rule hiding is to properly modify (or, as is usually said, sanitize²) the original data
so that any association rule mining algorithm that may be applied to the sanitized
version of the data (i) will be unable to uncover the sensitive rules under certain
parameter settings, and (ii) will be able to mine all the nonsensitive rules that appeared
in the original dataset (under the same or higher parameter settings) and no
other rules. The challenge that arises in the context of association rule hiding can
thus be stated as follows:
How can we modify (sanitize) the transactions of a database so that all
the nonsensitive association rules that are found when mining this database
can still be mined from its sanitized counterpart (under certain parameter settings),
while, at the same time, all the sensitive rules are guarded against disclosure
and no other (originally nonexistent) rules can be mined?
Association rule hiding algorithms are specially designed to solve
this challenging problem. They accomplish this by introducing a small distortion
to the transactions of the original database that blocks the production
of the sensitive association rules in its sanitized counterpart, while still allowing the
mining of the nonsensitive knowledge. What differentiates the quality of one association
rule hiding methodology from that of another is the actual distortion
caused to the original database as a result of the hiding process. Ideally, the hiding
process should leave the nonsensitive knowledge intact to the highest possible degree.
Another very interesting problem has been investigated recently which, even though
it is not targeted at addressing privacy issues per se, gives a special solution to the
association rule hiding problem. This problem is known as inverse frequent itemset mining [48].
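To make the distortion idea concrete, the following is a minimal illustrative sketch, not any specific published hiding algorithm: for a sensitive rule X → Y, one can delete an item of X from transactions that support X ∪ Y until the itemset's support falls below the mining threshold minsup, so that the rule can no longer be produced. All function names and the toy dataset here are hypothetical.

```python
# Naive distortion-based hiding sketch (hypothetical, for illustration only):
# lower the support of a sensitive rule X -> Y below minsup by deleting
# one item of X from transactions that support X U Y.

def support(itemset, transactions):
    """Fraction of transactions containing every item of `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def hide_rule(antecedent, consequent, transactions, minsup):
    """Return a sanitized copy in which support(X U Y) < minsup."""
    rule_items = antecedent | consequent
    sanitized = [set(t) for t in transactions]   # work on a copy
    victim = next(iter(antecedent))              # arbitrary item to delete
    for t in sanitized:
        if support(rule_items, sanitized) < minsup:
            break                                # rule is now hidden
        if rule_items <= t:
            t.discard(victim)                    # distort this transaction
    return sanitized

# Toy dataset: the rule {a} -> {b} has support 2/4 = 0.5.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
sanitized = hide_rule({"a"}, {"b"}, transactions, minsup=0.5)
print(support({"a", "b"}, sanitized))  # 0.25: now below minsup
```

Note that even this single deletion has side effects: the nonsensitive itemset {a, c} also loses support in the distorted transaction. Minimizing exactly this kind of collateral damage to nonsensitive knowledge is what separates a good hiding methodology from a poor one.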
² A dataset is said to be sanitized when it appropriately protects the sensitive knowledge from being
mined, under certain parameter settings. Similarly, a transaction of a dataset is sanitized when it
no longer supports any sensitive itemset or rule. Finally, an item is called sanitized when it is altered
in a given transaction to accommodate the hiding of the sensitive knowledge.