both in QI-group 1. Hence, we proceed to calculate the probability p that a
tuple in the QI-group falls in Q (Figure 1). This calculation does not need
any assumption about the data distribution in the Age-Zipcode plane, because
the distribution is precisely released. Specifically, the QIT (Table 4a) shows
that tuples 1 and 2 in QI-group 1 appear in Q, leading to the exact p = 50%.
Thus, we obtain an answer 2p = 1, which is also the actual query result.
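The estimation step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the QIT and ST rows below are placeholder values (Table 4 is not reproduced here), chosen so that two of the four tuples in QI-group 1 fall in the query region, and the function name `estimate_count`, the region predicate, and the choice of pneumonia as the queried value are hypothetical.

```python
from collections import defaultdict

def estimate_count(qit, st, in_region, sensitive_value):
    """Estimate COUNT(*) of tuples whose QI values satisfy in_region
    and whose sensitive value equals sensitive_value."""
    size = defaultdict(int)  # number of tuples per QI-group
    hits = defaultdict(int)  # tuples per QI-group inside the query region
    for *qi_values, group_id in qit:
        size[group_id] += 1
        if in_region(*qi_values):
            hits[group_id] += 1

    estimate = 0.0
    for group_id, value, count in st:
        if value == sensitive_value and size[group_id] > 0:
            p = hits[group_id] / size[group_id]  # exact, read off the QIT
            estimate += count * p
    return estimate

# Placeholder QIT rows (assumed values): (age, zipcode, group_id).
qit = [
    (23, 11000, 1), (27, 13000, 1), (35, 59000, 1), (59, 12000, 1),
    (61, 54000, 2), (65, 25000, 2), (65, 25000, 2), (70, 30000, 2),
]
# Placeholder ST rows (assumed values): (group_id, sensitive_value, count).
st = [
    (1, "dyspepsia", 2), (1, "pneumonia", 2),
    (2, "flu", 2), (2, "other-a", 1), (2, "other-b", 1),
]

def in_region(age, zipcode):
    # Assumed query region Q in the Age-Zipcode plane.
    return age <= 30 and 10001 <= zipcode <= 20000

answer = estimate_count(qit, st, in_region, "pneumonia")
print(answer)
```

With these placeholder rows, p = 2/4 for QI-group 1 and the ST count is 2, so the estimate is 2 × 0.5 = 1.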
4.3 Formalization of Anatomy
As with generalization, Anatomy requires partitioning the microdata $T$.
Definition 4. A partition consists of several subsets of $T$, such that each
tuple in $T$ belongs to exactly one subset. We refer to these subsets as
QI-groups, and denote them as $QI_1, QI_2, \ldots, QI_m$. Namely,
$\bigcup_{j=1}^{m} QI_j = T$ and, for any $1 \le j_1 \ne j_2 \le m$,
$QI_{j_1} \cap QI_{j_2} = \emptyset$.
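The two conditions of Definition 4 (the groups cover T, and no two groups overlap) can be checked directly. The following is a minimal sketch that assumes tuples are identified by integer ids; the function name `is_partition` is hypothetical.

```python
def is_partition(tuple_ids, groups):
    """Return True if groups cover tuple_ids and are pairwise disjoint."""
    seen = set()
    for group in groups:
        members = set(group)
        if seen & members:       # some tuple already belongs to an earlier group
            return False
        seen |= members
    return seen == set(tuple_ids)  # the union of all groups must equal T

# Tuples 1-8 split into two QI-groups of four tuples each.
T = range(1, 9)
ok = is_partition(T, [[1, 2, 3, 4], [5, 6, 7, 8]])
overlapping = is_partition(T, [[1, 2, 3, 4], [4, 5, 6, 7, 8]])
incomplete = is_partition(T, [[1, 2, 3, 4], [5, 6, 7]])
print(ok, overlapping, incomplete)
```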
We are interested only in l-diverse partitions that can lead to provably good
privacy guarantees. Specifically, a partition with $m$ QI-groups is l-diverse
if each QI-group $QI_j$ ($1 \le j \le m$) satisfies the following condition.
Let $v$ be the most frequent $A^s$ value in $QI_j$, and $c_j(v)$ the number of
tuples $t \in QI_j$ with $t[d+1] = v$; then

$$c_j(v) / |QI_j| \le 1/l \qquad (2)$$

where $|QI_j|$ is the size (the number of tuples) of $QI_j$. Table 3a shows a
partition with two QI-groups, where $QI_1$ contains tuples 1-4, and $QI_2$
includes tuples 5-8. In $QI_1$, dyspepsia and pneumonia are equally frequent,
i.e., $c_1(\text{dyspepsia}) = c_1(\text{pneumonia}) = 2$. In $QI_2$, the most
frequent $A^s$ value is flu, i.e., $c_2(\text{flu}) = 2$. Since
$|QI_1| = |QI_2| = 4$, according to Inequality 2, we know that $QI_1$ and
$QI_2$ constitute a 2-diverse partition.
We are ready to formulate the QIT and ST tables published by anatomy.
Definition 5 ([18]). Given an l-diverse partition with $m$ QI-groups,
anatomy produces a quasi-identifier table (QIT) and a sensitive table
(ST) as follows. The QIT has schema

$$(A^q_1, A^q_2, \ldots, A^q_d, \text{Group-ID}). \qquad (3)$$

For each QI-group $QI_j$ ($1 \le j \le m$) and each tuple $t \in QI_j$, QIT
has a tuple of the form:

$$(t[1], t[2], \ldots, t[d], j). \qquad (4)$$
The ST has schema

$$(\text{Group-ID}, A^s, \text{Count}). \qquad (5)$$

For each QI-group $QI_j$ ($1 \le j \le m$) and each distinct $A^s$ value $v$
in $QI_j$, the ST has a record of the form:

$$(j, v, c_j(v)). \qquad (6)$$
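As an illustration of Inequality 2 and Definition 5, the sketch below checks l-diversity and then builds the QIT and ST from a partition. It is not the authors' implementation. The function names, the QI values, and the two sensitive values "other-a" and "other-b" are placeholders. Only the counts for dyspepsia, pneumonia, and flu follow the example in the text. Each tuple is assumed to store its d QI values first and its sensitive value last.

```python
from collections import Counter

def is_l_diverse(groups, l):
    """Check c_j(v) / |QI_j| <= 1/l for the most frequent value v of each group."""
    for group in groups:
        if not group:
            raise ValueError("QI-groups must be non-empty")
        counts = Counter(t[-1] for t in group)
        most_frequent = max(counts.values())
        if most_frequent * l > len(group):  # same test, without division
            return False
    return True

def anatomize(groups):
    """Build (QIT, ST) from a partition given as a list of QI-groups."""
    qit, st = [], []
    for j, group in enumerate(groups, start=1):
        counts = Counter()
        for t in group:
            qit.append((*t[:-1], j))   # (t[1], ..., t[d], j)
            counts[t[-1]] += 1
        for v, c in counts.items():
            st.append((j, v, c))       # (j, v, c_j(v))
    return qit, st

# Placeholder microdata: (age, zipcode, disease); the QI values are assumed.
groups = [
    [(23, 11000, "pneumonia"), (27, 13000, "dyspepsia"),
     (35, 59000, "dyspepsia"), (59, 12000, "pneumonia")],
    [(61, 54000, "flu"), (65, 25000, "other-a"),
     (65, 25000, "flu"), (70, 30000, "other-b")],
]

two_diverse = is_l_diverse(groups, 2)
three_diverse = is_l_diverse(groups, 3)
qit, st = anatomize(groups)
print(two_diverse, three_diverse)
print(st)
```

The QIT keeps one row per tuple, so QI values are released exactly, while the ST only reveals how often each sensitive value occurs within a group.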