… privacy regulations requires the incorporation of advanced and sophisticated privacy preserving methodologies.
1.1 Privacy Preserving Data Mining
Privacy preserving data mining is a relatively new research area in the data mining
community, spanning approximately a decade of existence. It investigates the side-effects
of data mining methods that stem from the intrusion into the privacy of
individuals and organizations. Since the pioneering work of Agrawal & Srikant [8]
and Lindell & Pinkas [43] in 2000, several approaches have been proposed in the
research literature for preserving privacy in data mining. The majority of the
proposed approaches can be classified along two principal research directions: (i)
data hiding approaches and (ii) knowledge hiding approaches.
The first direction collects methodologies that investigate how the privacy of raw
data, or information, can be maintained before the data are mined. The
approaches in this category aim at removing confidential or private information
from the original data prior to its disclosure and operate by applying techniques
such as perturbation, sampling, generalization or suppression, and transformation
to generate a sanitized counterpart of the original dataset. Their ultimate goal is to
enable the data miner to obtain accurate data mining results without being provided
with the real data, or to adhere to specific regulations pertaining to microdata publication
(e.g., as is the case when publishing patient-specific data).
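To make the notion of a sanitized counterpart concrete, the following minimal Python sketch combines three of the techniques named above (perturbation, generalization, and suppression) on a single toy record. The record fields, the noise scale, and the age bands are hypothetical illustrations chosen for this sketch, not parameters of any particular method from the literature.

```python
# Illustrative sketch only: a toy sanitizer applying perturbation,
# generalization, and suppression to one hypothetical record.
import random

def perturb(value, scale=2.0):
    """Additive noise perturbation of a numeric attribute."""
    return value + random.uniform(-scale, scale)

def generalize_age(age, band=10):
    """Generalize an exact age into a coarser interval, e.g. 37 -> '30-39'."""
    low = (age // band) * band
    return f"{low}-{low + band - 1}"

def suppress(_value):
    """Suppress (remove) a directly identifying attribute."""
    return "*"

def sanitize(record):
    """Produce a sanitized counterpart of one patient-specific record."""
    return {
        "name": suppress(record["name"]),               # direct identifier
        "age": generalize_age(record["age"]),           # quasi-identifier
        "weight": round(perturb(record["weight"]), 1),  # numeric measurement
        "diagnosis": record["diagnosis"],               # kept for mining utility
    }

if __name__ == "__main__":
    original = {"name": "Alice", "age": 37, "weight": 62.5, "diagnosis": "flu"}
    print(sanitize(original))
```

In practice, which attributes are perturbed, generalized, or suppressed is dictated by the privacy model and the regulation being targeted.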
The second direction involves methodologies that aim to protect the sensitive
data mining results (i.e., the knowledge patterns extracted by applying data mining
tools to the original database) rather than the raw data itself. This direction of
approaches mainly deals with distortion and blocking techniques that prevent the
leakage of sensitive knowledge patterns in the disclosed data, as well as with
techniques for downgrading the effectiveness of classifiers in classification tasks,
such that the produced classifiers do not reveal any sensitive knowledge.
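As a concrete illustration of the distortion idea, the following minimal Python sketch hides a sensitive frequent itemset by deleting one of its items from just enough supporting transactions to push its support below the mining threshold. The toy transaction database, the threshold, and the naive choice of which item to delete are assumptions made purely for illustration; practical hiding algorithms additionally try to minimize side-effects on the non-sensitive patterns.

```python
# Illustrative sketch only: hiding a sensitive itemset by distortion.

def support(db, itemset):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(1 for t in db if itemset <= t) / len(db)

def hide_itemset(db, sensitive, min_sup):
    """Distort the database until the sensitive itemset becomes infrequent."""
    db = [set(t) for t in db]        # work on a copy of the transactions
    victim = next(iter(sensitive))   # naive choice of the item to delete
    for t in db:
        if support(db, sensitive) < min_sup:
            break                    # itemset is already hidden
        if sensitive <= t:
            t.discard(victim)        # remove the victim item from this transaction
    return db

if __name__ == "__main__":
    transactions = [{"a", "b", "c"}, {"a", "b"}, {"b", "c"}, {"a", "b", "c"}, {"c"}]
    sensitive = {"a", "b"}
    print("support before:", support(transactions, sensitive))
    sanitized = hide_itemset(transactions, sensitive, min_sup=0.4)
    print("support after: ", support(sanitized, sensitive))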
1.2 Association Rule Hiding
In this book, we focus on the knowledge hiding thread of privacy preserving data
mining and study a specific class of approaches which are collectively known
as frequent itemset and association rule hiding approaches. Other classes of approaches
under the knowledge hiding thread include classification rule hiding, clustering
model hiding, sequence hiding, and so on. An overview of these
methodologies can be found in [4, 28, 70, 73]. Some of these methodologies are also
surveyed in Chapter 4 of this book.