Privacy Preserving Publication: Anonymization Frameworks and Principles - Database Security: Applications and Trends

$$(t[1], t[2], \ldots, t[d], j, v, c_j(v)) \tag{8}$$

where $j$ is the ID of the QI-group including $t$ (i.e., $t \in QI_j$), $v$ an $A^s$ value, and $c_j(v)$ the number of tuples in $QI_j$ with $A^s$ value $v$. Then, from an adversary's perspective,

$$\Pr\{t[d+1] = v\} = c_j(v) / |QI_j| \tag{9}$$

where $|QI_j|$ denotes the size of $QI_j$.
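
As a rough illustration of Equation 9, the following Python sketch derives the distribution of a tuple's sensitive value from the ST alone. The row layout `(group_id, sensitive_value, count)`, the function name `sensitive_distribution`, and the placeholder values `value_a` to `value_d` are assumptions made for this example; only the size of QI-group 2 (four tuples) and its flu count (two) are taken from the running example.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical ST rows as (group_id, sensitive_value, count).
# Only the group-2 size (4) and the flu count (2) follow the text;
# every other value is a placeholder.
ST = [
    (1, "value_a", 2),
    (1, "value_b", 2),
    (2, "flu", 2),
    (2, "value_c", 1),
    (2, "value_d", 1),
]


def sensitive_distribution(st_rows, group_id):
    """Return {v: c_j(v) / |QI_j|} for QI-group j = group_id (Equation 9)."""
    counts = defaultdict(int)
    for gid, value, count in st_rows:
        if gid == group_id:
            counts[value] += count
    # |QI_j|: each tuple carries exactly one sensitive value,
    # so the counts of a group sum to the group size.
    group_size = sum(counts.values())
    if group_size == 0:
        raise ValueError(f"QI-group {group_id} does not appear in the ST")
    return {v: Fraction(c, group_size) for v, c in counts.items()}


dist = sensitive_distribution(ST, 2)
print(dist["flu"])  # 1/2
```

Exact fractions are used instead of floats so that the probabilities of a group sum to exactly 1.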

Corollary 1 ([18]). Given a pair of QIT and ST, an adversary can correctly re-construct any tuple $t \in T$ with a probability at most $1/l$.
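
One way to check the bound in Corollary 1 against a published ST is sketched below. It assumes that the 1/l guarantee comes from no sensitive value accounting for more than a 1/l fraction of any QI-group (the definition of l-diversity is not restated in this excerpt), and that each (group, value) pair occupies exactly one ST row. The function names and the sample rows are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical ST rows as (group_id, sensitive_value, count); placeholder data.
ST = [
    (1, "value_a", 2),
    (1, "value_b", 2),
    (2, "flu", 2),
    (2, "value_c", 1),
    (2, "value_d", 1),
]


def worst_case_reconstruction_probability(st_rows):
    """Largest c_j(v) / |QI_j| over all QI-groups j and sensitive values v."""
    sizes = defaultdict(int)
    for gid, _value, count in st_rows:
        sizes[gid] += count  # accumulate |QI_j|
    return max(Fraction(count, sizes[gid]) for gid, _value, count in st_rows)


def satisfies_bound(st_rows, l):
    """True if no tuple can be re-constructed with probability above 1/l."""
    return worst_case_reconstruction_probability(st_rows) <= Fraction(1, l)


print(worst_case_reconstruction_probability(ST))  # 1/2
print(satisfies_bound(ST, 2))  # True
print(satisfies_bound(ST, 3))  # False
```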

Corollary 1 gives the privacy protection guarantee at the tuple level. It is also necessary to discuss the corresponding guarantee at the individual level, since in practice multiple individuals may have the same QI-values, which complicates the privacy attack performed by an adversary.

To explain this, consider that an adversary has the age 65 and zipcode 25000 of Alice (the “owner” of tuple 7 in Table 3a), and wants to infer the medical record of Alice from the QIT and ST in Tables 4a and 4b, respectively. S/he consults the QIT, and sees that, in QI-group 2 (denoted as $QI_2$), both tuples 6 and 7 match the QI-values of Alice. Hence, s/he examines two scenarios.

First, assuming that tuple 6 belongs to Alice, the adversary uses Lemma 1 to derive the probability distribution for the tuple's disease value. According to Equation 9, tuple 6 has probability $c_2(\text{flu}) / |QI_2| = 2/4 = 50\%$ to carry flu. Notice that, in the microdata, tuple 6 does not really belong to Alice. However, this does not matter: the adversary may “happen to” use a wrong tuple to infer the correct sensitive value of Alice! From tuple 6, the adversary actually has 50% probability to figure out that Alice contracted flu.

In the second scenario, the adversary assumes that tuple 7 belongs to Alice, through which (similar to tuple 6) s/he also has 50% probability to obtain the real disease of Alice. Finally, (without further knowledge) the adversary assumes that the two scenarios occur with the same likelihood $1/2$. Therefore, the overall breach probability should be calculated as $\frac{1}{2} \cdot 50\% + \frac{1}{2} \cdot 50\% = 50\%$, where $1/2$ and 50% have the same semantics as in the above discussion.
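
The two-scenario calculation can be sketched in Python as follows. This is only an illustration under stated assumptions: the QIT and ST layouts, the function name `breach_probability`, the QI-values of tuples 5 and 8, and the sensitive values `value_c` and `value_d` are placeholders. Only tuples 6 and 7 (age 65, zipcode 25000, QI-group 2), the group size of 4, and the flu count of 2 follow the example.

```python
from fractions import Fraction

# Hypothetical QIT rows: (tuple_id, age, zipcode, group_id).
# Tuples 5 and 8 carry made-up QI-values.
QIT = [
    (5, 60, 20000, 2),
    (6, 65, 25000, 2),
    (7, 65, 25000, 2),
    (8, 72, 31000, 2),
]

# Hypothetical ST: (group_id, sensitive_value) -> count.
ST = {(2, "flu"): 2, (2, "value_c"): 1, (2, "value_d"): 1}


def breach_probability(qit, st, qi_values, true_value):
    """Individual-level probability that the adversary infers true_value.

    Every QIT tuple matching qi_values is treated as an equally likely
    scenario; within a scenario, Equation 9 gives the chance that the
    tuple carries true_value.
    """
    candidate_groups = [gid for _tid, age, zipcode, gid in qit
                        if (age, zipcode) == qi_values]
    if not candidate_groups:
        raise ValueError("no tuple in the QIT matches the given QI-values")
    scenario_weight = Fraction(1, len(candidate_groups))
    total = Fraction(0)
    for gid in candidate_groups:
        group_size = sum(c for (g, _v), c in st.items() if g == gid)  # |QI_j|
        total += scenario_weight * Fraction(st.get((gid, true_value), 0), group_size)
    return total


print(breach_probability(QIT, ST, (65, 25000), "flu"))  # 1/2
```

Because the function looks up the group of each matching tuple separately, the same sketch also applies when matching tuples are spread over several QI-groups.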

In fact, Lemma 1 shows that tuple 7 (the real tuple of Alice) can be re-constructed with 50% likelihood. Namely, the breach probability at the individual level coincides with that at the tuple level. This happens because tuples 6 and 7 appear in the same QI-group. In general, as long as tuples with identical QI-values always end up in the same QI-group (as is true for “global-recoding” generalization [8]), the probabilities of the two levels are always equivalent. In this case, it suffices to discuss only the (simpler) tuple level; as a result, the individual level has not been addressed before (all the existing generalization schemes adopt global recoding).

Anatomy, however, allows high flexibility in forming QI-groups such that tuples with the same QI-values do not always belong to the same QI-group.
