value s in t (t[S] = s), the identities which could be associated with t[S]
in the actual database R are those of the tuples in r's equivalence class:
$$\{\, c : ID \mid r' \in [r]_g,\ c[ID] = r'[ID] \,\}.$$
In Example 9, the attacker concludes that either Jack or Jill can have
bronchitis.
Assumptions on the attacker's knowledge. As introduced in [23, 24],
the defense against linking attacks relies on a few implicit assumptions, also
adopted by follow-up work. We explicitly list them below:
A1 For every r ∈ R, the attacker knows that r[ID] occurs in the database (e.g.
because r[ID] identifies an acquaintance or celebrity whose hospitalization
the attacker is aware of).
A2 For every r ∈ R, the attacker knows the value of the quasi-identifier
attributes r[QI] (e.g. due to access to some external public database).
A3 The attacker has no additional external knowledge to discriminate among
the possible identities, thus treating them as equi-probable.
Util The owner is willing to live with the privacy breach caused by publishing
the projection of R on S in the clear, since this is a minimal utility
requirement for statistical and data mining computations performed by
consumers of the released data.
Note that assumptions A1 and A2 are conservative: any guarantee holding
under them also defends against less informed attackers. In contrast,
assumption A3 is optimistic and weakens any guarantee, as it ignores
attackers who improve their guessing odds by exploiting background
knowledge to discriminate among alternatives. We address below versions of
anonymity which relax this assumption. Finally, regarding assumption Util,
note that [23] and most of its follow-up work concern themselves with choosing
generalizations of the quasi-identifier attributes so as to minimize information
loss, with the understanding that the sensitive data is released in the clear.
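The attacker's reasoning under assumptions A1 to A3 can be illustrated with a minimal Python sketch. The table contents, the attribute layout of QI (zip code and age), and the generalization function `g` are all hypothetical and chosen only for illustration; they are not the data of Example 9. The sketch groups tuples by their generalized quasi-identifier, returns the identities in a tuple's equivalence class, and assigns each candidate the same probability, as A3 prescribes.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical private table R with schema (ID, QI, S); QI = (zip, age).
R = [
    {"ID": "Jack", "QI": ("90210", 34), "S": "bronchitis"},
    {"ID": "Jill", "QI": ("90213", 37), "S": "flu"},
    {"ID": "Bob", "QI": ("10001", 52), "S": "asthma"},
]


def g(qi):
    """Hypothetical generalization: keep 3 zip digits, bucket age by decade."""
    zip_code, age = qi
    low = age // 10 * 10
    return (zip_code[:3] + "**", f"{low}-{low + 9}")


def equivalence_classes(table, generalize):
    """Group tuples that share the same generalized quasi-identifier."""
    classes = defaultdict(list)
    for row in table:
        classes[generalize(row["QI"])].append(row)
    return classes


def candidate_identities(table, generalize, r):
    """Identities of the tuples in r's equivalence class [r]_g."""
    members = equivalence_classes(table, generalize)[generalize(r["QI"])]
    return {member["ID"] for member in members}


def guess_probability(table, generalize, r):
    """Under A3, every candidate identity is considered equally likely."""
    return Fraction(1, len(candidate_identities(table, generalize, r)))


print(candidate_identities(R, g, R[0]))
print(guess_probability(R, g, R[0]))
```

In this toy table, Jack and Jill fall into the same class, so the attacker cannot tell which of them the published bronchitis tuple belongs to, whereas Bob is alone in his class and is fully exposed. An attacker with background knowledge (violating A3) would replace the uniform `Fraction(1, n)` with a skewed distribution.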
Relationship to GBP Model. We show the connection between the
GBP model and the privacy guarantees offered by an arbitrary anonymization
of a table via generalization. This will enable a comparison to the privacy
guarantees described in Section 3. Moreover, it will allow us to contrast various
anonymization guarantees found in the literature using a uniform framework.
In typical studies of generalization, the proprietary database D consists of
a single relation R of schema (ID, QI, S).
Assumptions A1 and A2 can equally well be modeled by assuming that
the owner (or some other authority) has already published the projection
of R on ID, QI:
$$V_{id}(R) := \Pi_{ID,QI}(R).$$
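As a rough illustration of this view, the following Python sketch computes the projection of a relation on ID and QI with set semantics. The helper `project` and the sample rows are hypothetical names and data introduced only for this example.

```python
def project(table, attrs):
    """Relational projection with set semantics: duplicate tuples collapse."""
    return {tuple(row[a] for a in attrs) for row in table}


# Hypothetical instance of R with schema (ID, QI, S); QI = (zip, age).
R = [
    {"ID": "Jack", "QI": ("90210", 34), "S": "bronchitis"},
    {"ID": "Jill", "QI": ("90213", 37), "S": "flu"},
]

# V_id(R): what the attacker is assumed to know under A1 and A2.
V_id = project(R, ("ID", "QI"))
print(sorted(V_id))
```

The resulting view pairs every identity with its exact quasi-identifier values and contains no sensitive attribute, which matches the knowledge granted to the attacker by A1 and A2.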
In our modeling, we separate the owner's concerns about releasing the
sensitive data (none according to assumption Util) and the quasi-identifier
data (serious concerns, calling for generalization). To this end, we consider
the projection of R on the sensitive attributes S as good as published, by
a view