Privacy in Database Publishing: A Bayesian Perspective - Database Security: Applications and Trends

value s in t (t[S] = s), the identities which could be associated with t[S] in the actual database R are those of the tuples in r's equivalence class: { c : ID | r' ∈ [r]_g, c[ID] = r'[ID] }. In Example 9, the attacker concludes that either Jack or Jill can have bronchitis.

Assumptions on the attacker's knowledge. As introduced in [23, 24], the defense against linking attacks relies on a few implicit assumptions, also adopted by follow-up work. We explicitly list them below:

A1 For every r ∈ R, the attacker knows that r[ID] occurs in the database (e.g. because r[ID] identifies an acquaintance or celebrity whose hospitalization the attacker is aware of).

A2 For every r ∈ R, the attacker knows the value of the quasi-identifier attributes r[QI] (e.g. due to access to some external public database).

A3 The attacker has no additional external knowledge to discriminate among the possible identities, thus treating them as equi-probable.

Util The owner is willing to live with the privacy breach caused by publishing the projection of R on S in the clear, since this is a minimal utility requirement for statistical and data mining computations performed by consumers of the released data.

Note that assumptions A1 and A2 are conservative, and any guarantee holding under them also defends against less informed attackers. In contrast, assumption A3 is optimistic and weakens any guarantee, as it ignores attackers who improve their guessing odds by exploiting background knowledge to discriminate among alternatives. Below we address versions of anonymity that relax this assumption. Finally, regarding assumption Util, note that [23] and most of its follow-up work concern themselves with choosing generalizations of the quasi-identifier attributes so as to minimize information loss, with the understanding that the sensitive data is released in the clear.
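
To make the role of assumptions A1 to A3 and Util concrete, the following minimal Python sketch replays a linking attack on a toy table. Only the structure follows the text: a relation of schema (ID, QI, S), equivalence classes induced by a generalization g, and uniform guessing per A3. The table contents, the function names, and the particular generalization `generalize` (zip prefix plus age decade) are hypothetical illustrations; the names Jack and Jill merely echo Example 9, and the attribute values are invented.

```python
from collections import defaultdict
from fractions import Fraction

# Toy relation R with schema (ID, QI, S); all values are invented for illustration.
R = [
    {"ID": "Jack", "QI": ("92122", 34), "S": "bronchitis"},
    {"ID": "Jill", "QI": ("92128", 37), "S": "flu"},
    {"ID": "Joe",  "QI": ("91001", 52), "S": "asthma"},
]


def generalize(qi):
    """Hypothetical generalization g: 3-digit zip prefix and 10-year age bucket."""
    zip_code, age = qi
    return (zip_code[:3] + "**", age // 10 * 10)


def equivalence_classes(relation, g):
    """Group identities by generalized QI.

    This uses exactly what A1 and A2 grant the attacker: every ID in R
    together with its quasi-identifier values.
    """
    classes = defaultdict(list)
    for r in relation:
        classes[g(r["QI"])].append(r["ID"])
    return classes


def linking_attack(relation, g, published_tuple):
    """Return the attacker's belief over identities for one published tuple.

    published_tuple is (generalized QI, sensitive value). Under A3 every
    identity in the matching equivalence class is equally likely.
    """
    gen_qi, _sensitive = published_tuple
    candidates = equivalence_classes(relation, g).get(gen_qi, [])
    return {ident: Fraction(1, len(candidates)) for ident in candidates}


# Published table: generalized QI next to S released in the clear (Util).
published = [(generalize(r["QI"]), r["S"]) for r in R]

# Attacker's belief about who owns the first published tuple (bronchitis).
belief = linking_attack(R, generalize, published[0])
```

In this toy instance the attacker holding the bronchitis tuple is left with Jack and Jill at probability 1/2 each. An attacker with background knowledge, that is, one violating A3, would replace the uniform weights with a skewed distribution, which is why A3 weakens the guarantee.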

Relationship to GBP Model. We show the connection between the GBP model and the privacy guarantees offered by an arbitrary anonymization of a table via generalization. This will enable a comparison to the privacy guarantees described in Section 3. Moreover, it will allow us to contrast various anonymization guarantees found in the literature using a uniform framework.

• In typical studies of generalization, the proprietary database D consists of a single relation R of schema (ID, QI, S).

• Assumptions A1 and A2 can equally well be modeled by assuming that the owner (or some other authority) has already published the projection of R on ID, QI:

V_id(R) := Π_{ID,QI}(R).

• In our modeling, we separate the owner's concerns on releasing the sensitive data (none according to assumption Util) and the quasi-identifier data (serious concerns, calling for generalization). To this end, we consider the projection of R on the sensitive attributes S as good as published, by a view
