Privacy in Database Publishing: A Bayesian Perspective - Database Security: Applications and Trends


$V_s(R) := \{\{\, t : S \mid r \in R,\ t[S] = r[S] \,\}\}$

Note that $V_s$ is defined under multi-set semantics (it preserves duplicates), thus revealing the distribution of sensitive values in the underlying population for the benefit of statistical studies.

In addition, the owner contemplates a new data release: the table R anonymized using publishing function $A_g$, which associates anonymized quasi-identifiers with clear sensitive values. [3]
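
As a small illustration of the multi-set semantics, the sketch below projects a toy table onto its sensitive attribute while keeping duplicates. The table contents, the attribute layout, and the helper name `sensitive_view` are hypothetical and are not taken from the chapter.

```python
from collections import Counter

# Hypothetical toy table R: (ID, quasi-identifier, S) tuples.
R = [
    ("alice", "zip=92093", "flu"),
    ("bob",   "zip=92093", "flu"),
    ("carol", "zip=92122", "cancer"),
]

def sensitive_view(table):
    """Multi-set projection on S: one output value per input tuple."""
    return Counter(s for (_id, _qi, s) in table)

v_s = sensitive_view(R)
print(v_s["flu"], v_s["cancer"])  # prints: 2 1
```

Because duplicates are kept, the view has as many entries as R has tuples, so the distribution of sensitive values can be read off directly.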

Under assumption Util, the owner is not concerned about the attacker's belief revision caused by seeing the sensitive values. The only revision she wishes to bound is the one caused by considering $A_g(R)$ on top of $V_s(R)$. To this end, we adopt the following convention: a priori, every attacker has access to views $V_{id}(R)$ and $V_s(R)$, and we treat the pair of views $(V_{id}, V_s)$ as a single publishing function. A posteriori refers to having released $A_g(R)$ on top of $(V_{id}(R), V_s(R))$.

• For each proprietary tuple $r \in R$, both the identity value $r[ID]$ and the sensitive value $r[S]$ are known a priori to the attacker, via views $V_{id}$ and $V_s$ respectively. The attacker is uncertain only about whether the two are associated in R. To hide this association from the attacker, the owner declares as secret the boolean query that checks the existence of some tuple $r' \in R$ which witnesses the association:

  $S_r := (\exists r' \in R)\ r'[ID] = r[ID] \wedge r'[S] = r[S].$

  Note that the secret does not include the quasi-identifier attributes since, by assumption A2, these are known for every identifier anyway (via $V_{id}$).

• Under assumption A3, the owner guards only against a single type of attacker, namely those who, for lack of additional external knowledge, deem all possible databases equally likely. We model these attackers by the uniform probability distribution $u$ on possible databases.
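
The two points above can be sketched in code: a boolean query that tests whether an identity and a sensitive value are associated, evaluated over every database the attacker considers possible, with all of them weighted equally. This is only a minimal sketch. It assumes that the possible databases are exactly the pairings of the known identities with the known multi-set of sensitive values, it omits quasi-identifiers for brevity, and all names and data are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical a priori knowledge: identities (from V_id) and the
# multi-set of sensitive values (from V_s).
ids = ["alice", "bob", "carol"]
sensitive = ["flu", "flu", "cancer"]

def possible_databases(ids, sensitive):
    """All distinct tables pairing each identity with one sensitive value."""
    return {tuple(zip(ids, perm)) for perm in permutations(sensitive)}

def secret_holds(table, r_id, r_s):
    """Boolean query: some tuple in the table has this ID and this S."""
    return any(t_id == r_id and t_s == r_s for (t_id, t_s) in table)

def uniform_probability(ids, sensitive, r_id, r_s):
    """Probability of the secret when all possible databases are equally likely."""
    worlds = possible_databases(ids, sensitive)
    hits = sum(secret_holds(w, r_id, r_s) for w in worlds)
    return Fraction(hits, len(worlds))

print(uniform_probability(ids, sensitive, "alice", "flu"))  # prints: 2/3
```

In this toy instance there are three distinct possible databases, and the identity "alice" is paired with "flu" in two of them.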

Denote the multiplicity of sensitive value $s$ in table $X$ with $\mathrm{mult}(s, X)$. Then it is easy to verify that, under assumptions A1, A2, and A3, the probability that $id = r[ID]$ is associated to $s = r[S]$ in R (i.e. that secret $S_r$ holds) is a priori (i.e. after seeing $V_{id}(R)$ and $V_s(R)$) given by $\frac{\mathrm{mult}(s, R)}{|R|}$. The a posteriori probability (after seeing $A_g(R)$) equals $\frac{\mathrm{mult}(s, [r]_g)}{|[r]_g|}$. It follows that $g$ offers the following guarantee of bounded belief revision for secret $S_r$:

$\mathrm{BFBR}^{R}_{\{u\},\,S_r}\left(A_g,\ (V_{id}, V_s),\ \frac{\mathrm{mult}(r[S], [r]_g)}{|[r]_g|} - \frac{\mathrm{mult}(r[S], R)}{|R|}\right).$
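
The a priori and a posteriori probabilities can be checked on a small example. The sketch below assumes that $[r]_g$ denotes the group of tuples sharing the anonymized quasi-identifier of $r$; the table, the generalized values, and the helper names are hypothetical.

```python
from fractions import Fraction

# Hypothetical table: (ID, anonymized quasi-identifier under g, S).
R = [
    ("alice", "zip=920**, age<40",  "flu"),
    ("bob",   "zip=920**, age<40",  "flu"),
    ("carol", "zip=920**, age<40",  "cancer"),
    ("dave",  "zip=921**, age>=40", "cancer"),
    ("erin",  "zip=921**, age>=40", "cancer"),
    ("frank", "zip=921**, age>=40", "flu"),
]

def mult(s, table):
    """Multiplicity of sensitive value s in the table."""
    return sum(1 for (_id, _qi, t_s) in table if t_s == s)

def group_of(r, table):
    """Assumed meaning of [r]_g: tuples with r's anonymized quasi-identifier."""
    return [t for t in table if t[1] == r[1]]

def belief_revision(r, table):
    """Return (a priori, a posteriori, difference) for the secret about r."""
    s = r[2]
    group = group_of(r, table)
    prior = Fraction(mult(s, table), len(table))
    posterior = Fraction(mult(s, group), len(group))
    return prior, posterior, posterior - prior

prior, posterior, delta = belief_revision(R[0], R)
print(prior, posterior, delta)  # prints: 1/2 2/3 1/6
```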

This immediately yields that the anonymization of R via $g$ satisfies the following privacy guarantee:

[3] In practice, view $V_s(R)$ is released simultaneously with anonymized table $A_g(R)$ (as its projection on S), not prior to it. Our modeling is merely a means to capture assumption Util.
