both in QI-group 1. Hence, we proceed to calculate the probability $p$ that a tuple in the QI-group falls in $Q$ (Figure 1). This calculation does not need any assumption about the data distribution in the Age-Zipcode plane, because the distribution is precisely released. Specifically, the QIT (Table 4a) shows that tuples 1 and 2 in QI-group 1 appear in $Q$, leading to the exact $p = 50\%$. Thus, we obtain an answer $2p = 1$, which is also the actual query result.
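The estimate above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the row values, the query region `in_q`, the sensitive value `"pneumonia"`, and the helper name `estimate_count` are all hypothetical, chosen so that tuples 1 and 2 of QI-group 1 fall in $Q$ and the group holds two tuples with the queried sensitive value. They are not copied from Table 4.

```python
# Hypothetical QIT rows: (age, zipcode, group_id). Values are made up.
qit = [
    (23, 11000, 1),  # tuple 1, inside Q
    (27, 13000, 1),  # tuple 2, inside Q
    (35, 59000, 1),  # tuple 3, outside Q
    (59, 12000, 1),  # tuple 4, outside Q
]

# Hypothetical ST rows: (group_id, sensitive_value, count).
st = [(1, "pneumonia", 2), (1, "dyspepsia", 2)]


def in_q(age, zipcode):
    """Assumed query region Q in the Age-Zipcode plane."""
    return age <= 30 and 10001 <= zipcode <= 20000


def estimate_count(qit, st, in_region, sensitive_value):
    """Sum, over QI-groups, of (count of the sensitive value) * p,
    where p is the exact fraction of the group's QIT tuples inside Q."""
    estimate = 0.0
    for group_id, value, count in st:
        if value != sensitive_value:
            continue
        members = [(a, z) for a, z, g in qit if g == group_id]
        if not members:
            continue
        p = sum(in_region(a, z) for a, z in members) / len(members)
        estimate += count * p
    return estimate


print(estimate_count(qit, st, in_q, "pneumonia"))
```

With these made-up rows, $p$ is 2/4 and the group contributes $2 \cdot 0.5 = 1$, matching the reasoning in the text. No distributional assumption is needed because `p` is read directly from the released QI values.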
4.3 Formalization of Anatomy
As with generalization, Anatomy requires partitioning the microdata $T$.
Definition 4. A partition consists of several subsets of $T$, such that each tuple in $T$ belongs to exactly one subset. We refer to these subsets as QI-groups, and denote them as $QI_1, QI_2, \ldots, QI_m$. Namely, $\bigcup_{j=1}^{m} QI_j = T$ and, for any $1 \le j_1 \ne j_2 \le m$, $QI_{j_1} \cap QI_{j_2} = \emptyset$.
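As a minimal sketch, the two conditions of Definition 4 (the groups cover $T$ and are pairwise disjoint) can be checked as follows. Tuples are represented by integer identifiers, and the function name `is_partition` is a hypothetical choice made for this example.

```python
def is_partition(table_ids, groups):
    """Return True if `groups` covers every id in `table_ids` exactly once."""
    universe = set(table_ids)
    seen = set()
    for group in groups:
        for tid in group:
            # An id outside T, or one already placed in another group,
            # violates the definition.
            if tid not in universe or tid in seen:
                return False
            seen.add(tid)
    # Every tuple of T must belong to some group.
    return seen == universe


# Eight tuples split into two groups of four.
print(is_partition(range(1, 9), [[1, 2, 3, 4], [5, 6, 7, 8]]))
```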
We are interested only in $l$-diverse partitions that can lead to provably good privacy guarantees. Specifically, a partition with $m$ QI-groups is $l$-diverse, if each QI-group $QI_j$ ($1 \le j \le m$) satisfies the following condition. Let $v$ be the most frequent $A^s$ value in $QI_j$, and $c_j(v)$ the number of tuples $t \in QI_j$ with $t[d+1] = v$; then

$$\frac{c_j(v)}{|QI_j|} \le \frac{1}{l} \tag{2}$$

where $|QI_j|$ is the size (the number of tuples) of $QI_j$. Table 3a shows a partition with two QI-groups, where $QI_1$ contains tuples 1-4, and $QI_2$ includes tuples 5-8. In $QI_1$, dyspepsia and pneumonia are equally frequent, i.e., $c_1(\text{dyspepsia}) = c_1(\text{pneumonia}) = 2$. In $QI_2$, the most frequent $A^s$ value is flu, i.e., $c_2(\text{flu}) = 2$. Since $|QI_1| = |QI_2| = 4$, according to Inequality 2, we know that $QI_1$ and $QI_2$ constitute a 2-diverse partition.

We are ready to formulate the QIT and ST tables published by anatomy.
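Before turning to the published tables, the $l$-diversity test of Inequality 2 can be sketched in Python. Each QI-group is given as the list of its sensitive values. The counts for dyspepsia, pneumonia, and flu follow the Table 3a example in the text, while the two remaining values in the second group (`"other_a"`, `"other_b"`) are placeholders, since the text does not list them. The name `is_l_diverse` is hypothetical.

```python
from collections import Counter


def is_l_diverse(groups, l):
    """Check c_j(v) / |QI_j| <= 1 / l for the most frequent value v of every group."""
    for sensitive_values in groups:
        if not sensitive_values:
            return False
        most_frequent = max(Counter(sensitive_values).values())
        # Integer form of the inequality, avoiding floating-point division.
        if most_frequent * l > len(sensitive_values):
            return False
    return True


groups = [
    ["pneumonia", "dyspepsia", "dyspepsia", "pneumonia"],  # QI_1
    ["flu", "other_a", "flu", "other_b"],                  # QI_2 (placeholders)
]
print(is_l_diverse(groups, 2))
print(is_l_diverse(groups, 3))
```

In both groups the most frequent value occurs 2 times out of 4, so the partition passes for $l = 2$ and fails for $l = 3$.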
Definition 5 ([18]). Given an $l$-diverse partition with $m$ QI-groups, anatomy produces a quasi-identifier table (QIT) and a sensitive table (ST) as follows. The QIT has schema

$$(A^q_1, A^q_2, \ldots, A^q_d, \text{Group-ID}). \tag{3}$$

For each QI-group $QI_j$ ($1 \le j \le m$) and each tuple $t \in QI_j$, QIT has a tuple of the form:

$$(t[1], t[2], \ldots, t[d], j). \tag{4}$$

The ST has schema

$$(\text{Group-ID}, A^s, \text{Count}). \tag{5}$$

For each QI-group $QI_j$ ($1 \le j \le m$) and each distinct $A^s$ value $v$ in $QI_j$, the ST has a record of the form:

$$(j, v, c_j(v)) \tag{6}$$
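The construction in Definition 5 can be sketched as follows, assuming the partition is already given and each tuple stores its $d$ QI values followed by the sensitive value in the last position. The function name `anatomize` and the sample rows are hypothetical and serve only to illustrate forms (4) and (6).

```python
from collections import Counter


def anatomize(groups):
    """Build (QIT, ST) from a list of QI-groups.

    Each group is a list of tuples (qi_1, ..., qi_d, sensitive_value).
    """
    qit, st = [], []
    for j, group in enumerate(groups, start=1):
        # Form (4): keep the QI values and append the Group-ID.
        for t in group:
            qit.append(t[:-1] + (j,))
        # Form (6): one record per distinct sensitive value with its count c_j(v).
        counts = Counter(t[-1] for t in group)
        for v, c in counts.items():
            st.append((j, v, c))
    return qit, st


# Made-up rows (age, zipcode, disease), for illustration only.
groups = [
    [(23, 11000, "pneumonia"), (27, 13000, "dyspepsia"),
     (35, 59000, "dyspepsia"), (59, 12000, "pneumonia")],
    [(61, 54000, "flu"), (65, 25000, "other_a"),
     (65, 25000, "flu"), (70, 30000, "other_b")],
]
qit, st = anatomize(groups)
for row in st:
    print(row)
```

The QIT keeps one row per original tuple, while the ST only reveals how often each sensitive value occurs within a group. The link between an individual QI row and its sensitive value is therefore not published.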