20
Privacy Preserving Publication:
Anonymization Frameworks and Principles
Yufei Tao
Department of Computer Science and Engineering
Chinese University of Hong Kong
Sha Tin, New Territories, Hong Kong
taoyf@cse.cuhk.edu.hk
Summary. Given a microdata table T, the objective of privacy preserving
publication is to release a distorted version T* of T such that T* does not allow an
adversary to confidently derive the sensitive data of any individual, and yet T* can
be used to analyze the statistical patterns significant in T. Each existing method
of privacy preserving publication is essentially the integration of an
anonymization framework and an anonymization principle. Specifically, a framework describes
how anonymization is performed, whereas a principle measures whether a sufficient
amount of anonymization has been applied. In this chapter, we discuss the
characteristics of two existing frameworks, generalization and anatomy, and of the
two most popular principles, k-anonymity and l-diversity.
1 Introduction
This chapter discusses an important problem in the data privacy literature,
known as privacy preserving publication. Formally, we have a trusted publisher
that holds a microdata table T, where each tuple describes the information of
an individual. For our discussion, assume that T has d non-sensitive attributes
A_1, A_2, ..., A_d and a sensitive attribute A_s. The objective
is to publish an anonymized version T* of T such that T* does not allow an
adversary to confidently derive the sensitive data of any individual, and yet
T* can be used to analyze the statistical patterns significant in T.
As a concrete example, suppose that the publisher is a hospital
and T is given in Table 1a. Here, T has three non-sensitive attributes A_1 =
Age, A_2 = Sex, A_3 = Zipcode, and a sensitive attribute A_s = Disease. The
column Name specifies the owners of the tuples; e.g., Tuple 1 indicates that
Andy, aged 5, lives in a neighborhood with Zipcode 12000 and has contracted
gastric-ulcer. Obviously, Name should not be published along with T, since it
explicitly reveals the identities of all individuals.
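The schema described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names Record, NON_SENSITIVE, SENSITIVE, and drop_identifier are hypothetical, Andy's Sex value is inferred from the pronoun "he", and the second row is made-up filler rather than a row of Table 1a.

```python
from dataclasses import dataclass

# Non-sensitive attributes A_1..A_3 and the sensitive attribute A_s,
# as named in the text. The identifiers below are illustrative only.
NON_SENSITIVE = ("age", "sex", "zipcode")
SENSITIVE = "disease"


@dataclass(frozen=True)
class Record:
    name: str      # explicit identifier; must not be published
    age: int
    sex: str
    zipcode: str
    disease: str


def drop_identifier(table):
    """Return the rows of `table` with the Name column projected out."""
    keep = NON_SENSITIVE + (SENSITIVE,)
    return [{attr: getattr(row, attr) for attr in keep} for row in table]


# Tuple 1 follows the text (Andy, aged 5, Zipcode 12000, gastric-ulcer);
# "M" is an inference from "he". The second row is hypothetical filler.
microdata = [
    Record("Andy", 5, "M", "12000", "gastric-ulcer"),
    Record("Hypothetical", 40, "F", "99999", "flu"),
]

published = drop_identifier(microdata)
print(published[0])
```

The projection removes only the explicit identifier; the non-sensitive attributes and the sensitive attribute are left exactly as they were in the microdata.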
Let T' be the resulting table after removing Name from T. At first glance,
it appears that we can simply release T', which, by itself, does not contain any