solution avoids creating unreal trajectories in the sanitization process, since the
road network is publicly available knowledge and thus unreal trajectories can be
easily identified. Moreover, all sensitive patterns are hidden in D′, that is, they
have a support no greater than the given disclosure threshold ψ. Finally, the last
requirement is that D′ is kept as similar as possible to D.
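To make the hiding condition concrete, the following is a minimal sketch, assuming that trajectories are sequences of road-segment identifiers and that a trajectory supports a pattern when the pattern occurs in it as a contiguous subsequence; both representational choices, and all names, are illustrative assumptions rather than the exact formalization used here:

def support(pattern, database):
    # Number of trajectories containing the pattern as a
    # contiguous subsequence of road-segment identifiers.
    n = len(pattern)
    return sum(
        any(traj[i:i + n] == pattern for i in range(len(traj) - n + 1))
        for traj in database
    )

def is_hidden(sensitive_patterns, d_prime, psi):
    # Hiding requirement: every sensitive pattern has support
    # no greater than the disclosure threshold psi in D'.
    return all(support(p, d_prime) <= psi for p in sensitive_patterns)

# Toy example over road segments r1..r4.
d_prime = [["r1", "r2", "r3"], ["r2", "r3", "r4"], ["r1", "r4"]]
print(is_hidden([["r2", "r3"]], d_prime, psi=1))  # False: support is 2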
9.4 Privacy by Design in Data Mining
As shown in the previous sections, several techniques have been proposed by
the scientific community to develop technological frameworks that counter the
undesirable and unlawful effects of privacy violations without obstructing the
knowledge discovery opportunities of data mining technologies.
However, the common finding is that no general method exists that can both
handle “generic personal data” and preserve “generic analytical results.” The
ideal solution would be to inscribe privacy protection into
the knowledge discovery technology by design, so that the analysis incorporates
the relevant privacy requirements from the very beginning. We evoke here the
concept of “privacy by design,” coined in the 1990s by Ann Cavoukian, the
Information and Privacy Commissioner of Ontario, Canada. In brief, privacy
by design refers to the philosophy and approach of embedding privacy into the
design, operation, and management of information-processing technologies and
systems.
The articulation of the general “by design” principle in the data mining
domain is that stronger protection and higher quality can both be achieved
through a goal-oriented approach. In such an approach, the data mining process
is designed with assumptions about (a brief illustrative sketch follows the list):
- The sensitive personal data that are the subject of the analysis;
- The attack model, that is, the knowledge and purpose of a malicious party
  that has an interest in discovering the sensitive data of certain individuals;
- The category of analytical queries that are to be answered with the data.
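As a minimal sketch, these three assumptions can be viewed as the explicit inputs to the design of the process; the class and field names below are hypothetical, chosen only to illustrate the idea:

from dataclasses import dataclass

@dataclass
class PrivacyByDesignSpec:
    # Hypothetical container for the three design assumptions of a
    # goal-oriented privacy-preserving analytical process.
    sensitive_data: list    # the personal data under analysis
    attack_model: str       # adversary's knowledge and purpose
    query_category: str     # class of analytical queries to answer

spec = PrivacyByDesignSpec(
    sensitive_data=["visited locations", "timestamps"],
    attack_model="adversary knows a sub-trajectory of the victim",
    query_category="spatio-temporal density queries",
)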
Under these assumptions, it is conceivable to design a privacy-preserving
analytical process able to:
1. Transform the data into an anonymous version with a quantifiable privacy
guarantee, that is, the probability that the malicious attack fails;
2. Guarantee that a category of analytical queries can be answered correctly,
within a quantifiable approximation that specifies the data utility, using the
transformed data instead of the original ones (both guarantees are illustrated
in the sketch below).
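A minimal sketch of how the two guarantees can be quantified, assuming a simple count query as the analytical query and relative error as the utility measure; both choices, like all names below, are illustrative assumptions:

def count_query(database, predicate):
    # A simple analytical query: how many records satisfy the predicate.
    return sum(1 for record in database if predicate(record))

def utility_error(original, transformed, predicate):
    # Quantifiable approximation (data utility): relative error of the
    # answer computed on the transformed data w.r.t. the original data.
    true_answer = count_query(original, predicate)
    approx_answer = count_query(transformed, predicate)
    return abs(true_answer - approx_answer) / max(true_answer, 1)

def privacy_guarantee(reidentification_probs):
    # Quantifiable privacy guarantee: the probability that the malicious
    # attack fails, taken in the worst case over all individuals.
    return 1.0 - max(reidentification_probs)

# Example: a guarantee of 0.95 means no individual is re-identified
# with probability above 0.05.
print(privacy_guarantee([0.01, 0.05, 0.02]))  # 0.95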
In the next sections we present two frameworks that offer two different
instances of the privacy by design paradigm in the case of personal mobility
trajectories (obtained from GPS devices or cell phones). The first one is suitable