Privacy Preserving Publication: Anonymization Frameworks and Principles - Database Security: Applications and Trends


both in QI-group 1. Hence, we proceed to calculate the probability p that a tuple in the QI-group falls in Q (Figure 1). This calculation does not need any assumption about the data distribution in the Age-Zipcode plane, because the distribution is precisely released. Specifically, the QIT (Table 4a) shows that tuples 1 and 2 in QI-group 1 appear in Q, leading to the exact p = 50%. Thus, we obtain an answer 2p = 1, which is also the actual query result.
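The estimate above can be sketched in Python. This is a minimal illustration, not the chapter's implementation: the rows of `QIT` and `ST`, the query region `in_q`, and the function name `estimate_count` are all assumptions made for the example, chosen only so that two of the four tuples in QI-group 1 fall in the query region and that group holds two pneumonia records.

```python
from collections import Counter

# Hypothetical QIT rows (age, zipcode, group_id); not copied from Table 4a.
QIT = [
    (23, 11000, 1), (27, 13000, 1), (35, 59000, 1), (59, 12000, 1),
    (61, 54000, 2), (65, 25000, 2), (65, 25000, 2), (70, 30000, 2),
]
# Hypothetical ST rows (group_id, sensitive_value, count).
ST = [
    (1, "dyspepsia", 2), (1, "pneumonia", 2),
    (2, "bronchitis", 1), (2, "flu", 2), (2, "gastritis", 1),
]


def estimate_count(qit, st, in_region, value):
    """Estimate how many tuples carry `value` and fall in the query region."""
    size = Counter(g for _, _, g in qit)  # |QI_j| per group
    hits = Counter(g for age, zc, g in qit if in_region(age, zc))
    # For each group: (count of `value` in the group) * p, where p is the
    # exact fraction of the group's QIT tuples inside the region.
    return sum(c * hits[g] / size[g] for g, v, c in st if v == value)


def in_q(age, zipcode):
    # Assumed query region Q in the Age-Zipcode plane.
    return age <= 30 and 10001 <= zipcode <= 20000


estimate = estimate_count(QIT, ST, in_q, "pneumonia")
print(estimate)  # 1.0, i.e. 2 * p with p = 2/4
```

No distribution assumption enters the computation: p is read directly off the QIT, which is the point made in the text.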

4.3 Formalization of Anatomy

As with generalization, Anatomy requires partitioning the microdata T.

Definition 4. A partition consists of several subsets of T, such that each tuple in T belongs to exactly one subset. We refer to these subsets as QI-groups, and denote them as $QI_1, QI_2, \ldots, QI_m$. Namely, $\bigcup_{j=1}^{m} QI_j = T$ and, for any $1 \le j_1 \ne j_2 \le m$, $QI_{j_1} \cap QI_{j_2} = \emptyset$.
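The two conditions of Definition 4 (the groups cover T, and no two groups overlap) can be checked mechanically. The sketch below assumes tuples are represented by integer identifiers; the function name `is_partition` is hypothetical.

```python
def is_partition(groups, table_ids):
    """Return True if `groups` are pairwise disjoint and their union is T."""
    seen = set()
    for group in groups:
        group = set(group)
        if seen & group:      # some tuple already belongs to an earlier group
            return False
        seen |= group
    return seen == set(table_ids)  # every tuple of T is covered


T = range(1, 9)  # tuple identifiers 1..8
print(is_partition([{1, 2, 3, 4}, {5, 6, 7, 8}], T))  # True
print(is_partition([{1, 2, 3, 4}, {4, 5, 6, 7, 8}], T))  # False: tuple 4 repeats
```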

We are interested only in l-diverse partitions that can lead to provably good privacy guarantees. Specifically, a partition with m QI-groups is l-diverse if each QI-group $QI_j$ ($1 \le j \le m$) satisfies the following condition. Let v be the most frequent $A^s$ value in $QI_j$, and $c_j(v)$ the number of tuples $t \in QI_j$ with $t[d+1] = v$; then

$c_j(v) / |QI_j| \le 1/l$ (2)

where $|QI_j|$ is the size (the number of tuples) of $QI_j$. Table 3a shows a partition with two QI-groups, where $QI_1$ contains tuples 1-4, and $QI_2$ includes tuples 5-8. In $QI_1$, dyspepsia and pneumonia are equally frequent, i.e., $c_1(\text{dyspepsia}) = c_1(\text{pneumonia}) = 2$. In $QI_2$, the most frequent $A^s$ value is flu, i.e., $c_2(\text{flu}) = 2$. Since $|QI_1| = |QI_2| = 4$, according to Inequality 2, we know that $QI_1$ and $QI_2$ constitute a 2-diverse partition.

We are ready to formulate the QIT and ST tables published by anatomy.
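As a concrete check of Inequality 2, the following Python sketch tests whether a partition is l-diverse. Each QI-group is given as the list of its sensitive values. The function name `is_l_diverse` is hypothetical, and the two non-flu values in the second group are assumed for illustration; the text only states that flu occurs twice among its four tuples.

```python
from collections import Counter
from fractions import Fraction


def is_l_diverse(groups, l):
    """Check c_j(v) / |QI_j| <= 1/l for the most frequent value v of each group.

    Assumes every group is non-empty.
    """
    for values in groups:
        most_frequent = max(Counter(values).values())  # c_j(v)
        # Exact rational comparison avoids floating-point rounding.
        if Fraction(most_frequent, len(values)) > Fraction(1, l):
            return False
    return True


qi_1 = ["pneumonia", "dyspepsia", "dyspepsia", "pneumonia"]
qi_2 = ["flu", "gastritis", "flu", "bronchitis"]  # non-flu values are assumed

print(is_l_diverse([qi_1, qi_2], 2))  # True: 2/4 <= 1/2 in both groups
print(is_l_diverse([qi_1, qi_2], 3))  # False: 2/4 > 1/3
```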

Definition 5 ([18]). Given an l-diverse partition with m QI-groups, anatomy produces a quasi-identifier table (QIT) and a sensitive table (ST) as follows. The QIT has schema

$(A^q_1, A^q_2, \ldots, A^q_d, \text{Group-ID})$. (3)

For each QI-group $QI_j$ ($1 \le j \le m$) and each tuple $t \in QI_j$, QIT has a tuple of the form:

$(t[1], t[2], \ldots, t[d], j)$. (4)

The ST has schema

$(\text{Group-ID}, A^s, \text{Count})$. (5)

For each QI-group $QI_j$ ($1 \le j \le m$) and each distinct $A^s$ value v in $QI_j$, the ST has a record of the form:

$(j, v, c_j(v))$. (6)
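Definition 5 translates almost directly into code. The sketch below is one possible rendering under stated assumptions: each microdata tuple is a Python tuple whose last component is the sensitive value, the partition is supplied by the caller and is taken to be l-diverse already (the function does not verify this), and the sample rows are hypothetical. The name `anatomize` is likewise an assumption.

```python
from collections import Counter


def anatomize(groups):
    """Build the QIT and ST of Definition 5 from a list of QI-groups.

    Each group is a list of tuples (qi_1, ..., qi_d, sensitive_value).
    """
    qit, st = [], []
    for j, group in enumerate(groups, start=1):
        for t in group:
            qit.append(t[:-1] + (j,))           # (t[1], ..., t[d], j)
        counts = Counter(t[-1] for t in group)  # c_j(v) per distinct value v
        for v, c in sorted(counts.items()):
            st.append((j, v, c))                # (j, v, c_j(v))
    return qit, st


# Hypothetical microdata (age, zipcode, disease), already partitioned.
groups = [
    [(23, 11000, "pneumonia"), (27, 13000, "dyspepsia"),
     (35, 59000, "dyspepsia"), (59, 12000, "pneumonia")],
    [(61, 54000, "flu"), (65, 25000, "gastritis"),
     (65, 25000, "flu"), (70, 30000, "bronchitis")],
]
qit, st = anatomize(groups)
print(qit[0])  # (23, 11000, 1)
print(st[:2])  # [(1, 'dyspepsia', 2), (1, 'pneumonia', 2)]
```

The QIT keeps every quasi-identifier value exactly and the ST keeps only per-group counts, so the link between an individual tuple and its sensitive value is recoverable only up to the group.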
