2.2.1 Goals of Association Rule Hiding Methodologies
Association rule hiding methodologies aim to sanitize the original database so
that at least one of the following goals is accomplished:
1. No rule that the owner considers sensitive, and that can be mined from the
original database at pre-specified thresholds of confidence and support, can
also be revealed from the sanitized database when this database is mined at
the same or higher thresholds,
2. All the nonsensitive rules that appear when mining the original database at
pre-specified thresholds of confidence and support can be successfully mined
from the sanitized database at the same or higher thresholds, and
3. No rule that was not derived from the original database when it was mined at
pre-specified thresholds of confidence and support can be derived from its
sanitized counterpart when it is mined at the same or higher thresholds.
The first goal requires that all the sensitive rules disappear from the sanitized
database, when the database is mined under the same thresholds of support and
confidence as the original database, or at higher thresholds. A hiding solution that
achieves the first goal is termed feasible as it accomplishes the hiding task.
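
To make the first goal concrete, the following minimal Python sketch mines a toy
database by brute force before and after sanitization and checks that a sensitive
rule is no longer revealed. The helper names (support, mine_rules, is_feasible),
the toy transactions, and the thresholds are illustrative assumptions and do not
come from any particular hiding algorithm; the exhaustive enumeration is only
practical for very small item sets.

```python
from itertools import combinations

def support(db, itemset):
    """Fraction of transactions in db that contain every item of itemset."""
    return sum(1 for t in db if itemset <= t) / len(db)

def mine_rules(db, min_sup, min_conf):
    """Brute-force miner: return all rules (lhs, rhs) meeting both thresholds."""
    items = sorted(set().union(*db))
    rules = set()
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            sup = support(db, itemset)
            if sup == 0 or sup < min_sup:
                continue  # itemset is absent or not frequent enough
            for k in range(1, size):
                for lhs_items in combinations(combo, k):
                    lhs = frozenset(lhs_items)
                    # confidence(lhs -> rhs) = support(lhs ∪ rhs) / support(lhs)
                    if sup / support(db, lhs) >= min_conf:
                        rules.add((lhs, itemset - lhs))
    return rules

def is_feasible(sanitized_db, sensitive, min_sup, min_conf):
    """Goal 1: no sensitive rule is mined from the sanitized database."""
    return not (sensitive & mine_rules(sanitized_db, min_sup, min_conf))

# Hypothetical toy data: item "b" is removed from the second transaction.
original = [{"a", "b"}, {"a", "b"}, {"a"}, {"b"}]
sanitized = [{"a", "b"}, {"a"}, {"a"}, {"b"}]
sensitive = {(frozenset({"a"}), frozenset({"b"}))}  # the rule a -> b

rules_before = mine_rules(original, 0.5, 0.6)
feasible = is_feasible(sanitized, sensitive, 0.5, 0.6)
```

Because raising either threshold can only shrink the set of mined rules, a rule
that is hidden at the given thresholds in this sketch stays hidden at any higher
ones.
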
The second and the third goals involve the nonsensitive rules that may be lost
or generated as a side-effect of the employed sanitization process. Specifically, the
second goal simply states that there should be no lost rules in the sanitized database,
meaning that all the nonsensitive rules that were mined from the original database
should also be mined from its sanitized counterpart at the same (or higher) levels of
confidence and support. The third goal, on the other hand, states that no false rules
(also known as ghost rules) should be produced when mining the sanitized database
at the same (or higher) levels of confidence and support. A false (ghost) rule is an
association rule that was not among the ones mined from the original database and
thus it constitutes an artifact that was generated by the hiding process.
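
The two kinds of side effects reduce to set differences once the rules mined
from each database are available. The sketch below assumes the rule sets were
already mined at the same thresholds; the function name side_effects and the
example rules are hypothetical.

```python
def side_effects(original_rules, sanitized_rules, sensitive_rules):
    """Return (lost, ghost) rule sets for a sanitized database.

    lost  : nonsensitive rules of the original that no longer appear (goal 2)
    ghost : rules of the sanitized database absent from the original (goal 3)
    """
    nonsensitive = original_rules - sensitive_rules
    lost = nonsensitive - sanitized_rules
    ghost = sanitized_rules - original_rules
    return lost, ghost

# Rules are written as "lhs->rhs" strings purely for readability (made-up data).
original_rules = {"a->b", "b->c", "c->d"}
sensitive_rules = {"a->b"}
sanitized_rules = {"b->c", "d->a"}

lost, ghost = side_effects(original_rules, sanitized_rules, sensitive_rules)
```

In this made-up example, "c->d" is a lost rule and "d->a" is a ghost rule, while
the sensitive rule "a->b" counts as neither because hiding it is intended.
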
Based on these three goals, the sanitization process of a hiding algorithm has
to be accomplished in a way that minimally affects the original database, preserves
the general patterns and trends, and conceals all the sensitive association
rules. A solution that addresses all three goals (i.e., is feasible and introduces
no side-effects) is called exact. Exact hiding solutions that cause the least possible
distortion (modification) to the original database are called ideal or optimal. Lastly,
non-exact but feasible solutions are called approximate.
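
The terminology above can be summarized as a small decision procedure. This is
only a sketch: the function name classify_solution and the label "not feasible"
are assumptions introduced for illustration, and the inputs are the sets of
sensitive rules still exposed, lost rules, and ghost rules found in the
sanitized database.

```python
def classify_solution(exposed_sensitive, lost_rules, ghost_rules):
    """Label a hiding solution using the terms defined in the text."""
    if exposed_sensitive:
        # Goal 1 fails: some sensitive rule can still be mined.
        return "not feasible"
    if lost_rules or ghost_rules:
        # Goal 1 holds, but goal 2 or goal 3 is violated.
        return "approximate"
    # All three goals hold: feasible with no side effects.
    return "exact"

label_exact = classify_solution(set(), set(), set())
label_approx = classify_solution(set(), {"c->d"}, set())
label_failed = classify_solution({"a->b"}, set(), set())
```

Whether an exact solution is also ideal (optimal) depends on how much distortion
it causes relative to other exact solutions, which this sketch does not measure.
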
As a final remark, association rule hiding methodologies usually differ in how
they rank the aforementioned goals (especially the second and the third) in terms
of how important it is to satisfy them. With respect to the first goal, it is worth
noting that for any database and any set of sensitive association rules there exists
a feasible hiding solution, i.e., a solution that effectively hides all the sensitive
association rules in the database. This means that the first goal
can always be accomplished irrespective of the specific properties of the database
or the peculiarities of the hiding problem. The most trivial way to identify a feasible