Databases Reference
In-Depth Information
We are now ready to relate the definition of k-anonymity with the
GBP
model. Under additional assumption
A4
,if
g
yields a k-anonymization of
R
then the a priori probability of
1
|R|
S
r
is
and the a posteriori probability is
1
1
[
r
]
g
|
≤
k
:
|
A
g
,
1
1
(anon
k
(
g
)
BFBR
R
{
∧
A4
)
⇔
S
r
(
V
,
k
−
)
(6)
u
}
,
|
R
|
r∈R
A
g
,
1
BFBR
R
⇒
{u},S
r
(
V
,
k
)
.
(7)
r
∈
R
(7) states that under assumption
A4
the amount of belief revision for each
secret
S
r
is bounded by a constant rather than the size of the database.
We discuss next a widely applicable guarantee that lifts restriction
A4
,
relaxes restriction
A3
, and still bounds the amount of belief revision by an
owner-defined constant.
4.2 L-Diversity
Machanavajjhala et al. [16] point out two key deficiencies of the k-anonymity
guarantee: it does not withstand so-called
homogeneity
and
background
at-
tacks.
In the general case when sensitive attribute values may occur more than
once in
R
, vulnerability to homogeneity attacks arises whenever few sensitive
values occur with high multiplicity in an equivalence class. In particular, when
all tuples in
r
's equivalence class share the same sensitive value
s
, any attacker
can infer with certainty that
r
[
ID
] is associated with
s
. In this case, the
attacker learns the maximum possible amount of information about the secret
S
r
since its a posteriori probability is 1.
In background attacks, the attacker exploits external background informa-
tion to rule out a number of sensitive values as being definitely
not
associated
to
r
[
ID
]. The remaining alternatives are considered equi-probable. This class
of attackers is not covered by k-anonymity, which considers the single attacker
who a priori deems all associations equi-probable.
[16] proposes the concept of
l-diversity
to remedy these deficiencies of k-
anonymity. The intuition behind this concept is to defend against attackers
who are able to rule out at most
l
−
1 sensitive values from the equivalence
class of each
r
R
, by ensuring that the frequency of each sensitive value in
the remaining set of tuples is upper bounded by an owner-defined threshold.
[16] introduces the notion of
recursive
(
c, l
)
-diversity
as a sucient condition
for l-diversity.
For every
r
∈
R
,let
o
be the number of distinct sensitive values occurring in
r
's equivalence class. Let their list be
s
1
,...,s
o
, and let
m
i
be the multiplicity
of
s
i
in
r
's equivalence class. Assuming w.l.o.g. that
m
1
≥
∈
m
2
≥
...
≥
m
o
,we
say that the equivalence class of
r
satisfies recursive (
c, l
)-diversity if