Privacy in Database Publishing: A Bayesian Perspective - Database Security: Applications and Trends

Databases Reference

In-Depth Information

s is possible only if it is witnessed by some possible database i.e. if there exists

D ∈

( D ). Without worrying yet whether the attacker

can even compute all possible secrets, note that they provide an upper bound

on the set of candidates for the secret which an attacker needs to consider.

Let us denote the set of possible secrets with

[ D ] V such that s =

([ D ] V ):

( D )

D ∈

In particular, the actual secret S ( D ) is a possible secret:

([ D ] V ):=

[ D ] V }

( D )

∈S

([ D ] V ).

Example 3. Continuing Example 2, the possible secrets are obtained by run-

ning the

over each possible database. We obtain s 1 =

( D 1 )=

{

(John, flu),

(Jane, pneumonia)

}

, s 2 =

( D 2 )=

{

(John, flu), (John, pneumonia), (Jane,

flu)

}

,etc.

The optimal attack: compute possible secrets and use external

knowledge. In the absence of external knowledge, possible secrets are indis-

tinguishable with respect to the published data

( D ) and even with unlim-

ited computational resources the best an attacker can hope for is to reverse-

engineer

([ D ] V ). Towards a conservative privacy guarantee, let's assume that

the attacker is successful at this task, handling the case of infinitely many pos-

sible secrets by coming up with a finite representation thereof. 2 If there is only

one possible secret, then the actual secret is exposed and the attacker's task

accomplished. In the (likely) case of several possible secrets, a sophisticated

attacker improves his chances of singling out the actual secret by whittling

down

([ D ] V ) using external knowledge. If several possible secrets remain

even now, the attacker is forced to guess the actual secret among them. How-

ever, the guess does not have to be uneducated: while the attacker's external

knowledge may be insucient to further rule out any possible secrets, it could

still influence the attacker's beliefs about the relative likelihood of the possible

secrets. This would enable the attacker to pick the secret he believes likeli-

est. Finally, if the attacker deemed several possible secrets equally likely but

likelier than all others, he would be forced to guess at random among them.

Modeling attacker's belief. The attacker's external knowledge can per-

tain to the possible databases or exclusively to the possible secrets. Note that

any attacker who forms an opinion on how to rank possible databases can infer

the ranking of the corresponding secrets and is therefore at least as knowl-

edgeable (and dangerous) as an attacker who does not understand or care

about the underlying database, focusing solely on the secret.

To defend against the more dangerous class of attackers, we model the

attacker's a priori belief (i.e. before observing

( D )) as a probability distri-

bution δ on all databases. This induces a belief (probability distribution) P δ

on all secrets as follows: given candidate secret s , the probability P δ [ s ] that s

is the actual secret is the sum of probabilities of all databases witnessing s :

2 We know such representations exist: (an admittedly crude) one is given by the

definition of

together with

( D ).

Database Security: Applications and Trends

Search WWH ::

Custom Search

Home