not provide a lower bound for finding optimal l-diverse anonymizations, they conjecture NP-hardness as well, and show how to adapt the Incognito Algorithm [12].
Sensitive Data Generalization. There are slight exceptions to assumption Util: an example occurs in [22]. In this work, sensitive data is not published in the clear, but is itself generalized using a function f. The generalization function f exploits a hierarchy among concepts in the sensitive domain, treating ancestor concepts as more general than descendant concepts. For instance, instead of displaying "pneumonia", the owner may release a more general concept such as "respiratory tract problems", which in turn is generalized by "antibiotic-curable ailment". Evidently, the objective in [22] is to minimize the information loss resulting from the generalization of both quasi-identifiers and sensitive attributes. We can capture this scenario as well in the GBP model, by simply adjusting assumption Util to state that the owner is willing to live with the attacker's belief after seeing the generalized sensitive values described by the view V_s(R) := f(Π_S(R)).
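As an illustration of such a generalization function f, the following minimal sketch walks a small concept hierarchy one level at a time. The hierarchy, function names, and example values are hypothetical and only illustrate the idea; they are not the construction used in [22].

```python
# Sketch of a sensitive-value generalization function f over a concept hierarchy.
# Parent links map each concept to its more general ancestor (hypothetical data).
PARENT = {
    "pneumonia": "respiratory tract problems",
    "bronchitis": "respiratory tract problems",
    "respiratory tract problems": "antibiotic-curable ailment",
}

def generalize(value: str, levels: int = 1) -> str:
    """Replace a sensitive value by an ancestor concept `levels` steps up the hierarchy."""
    for _ in range(levels):
        value = PARENT.get(value, value)  # stop at the root if there is no parent
    return value

# V_s(R) := f(Π_S(R)): the owner releases generalized sensitive values.
sensitive_column = ["pneumonia", "bronchitis", "pneumonia"]
released = [generalize(v, levels=1) for v in sensitive_column]
print(released)  # ['respiratory tract problems', 'respiratory tract problems', 'respiratory tract problems']
```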
T-Closeness. One paper that explicitly states and exploits assumption Util is [14]. It considers the probability distribution p on the secrets {S_r}_{r∈R} after seeing the entire anonymized table A_g(R), and the probability distribution q of the sensitive values in R, i.e. in V_s(R). The authors introduce the privacy guarantee of t-closeness, which holds if the distance between distributions p and q is smaller than a threshold parameter t. The authors show shortcomings of standard metrics for comparing distributions and propose their own. They also show that the search for a t-close anonymization that maximizes utility (under a standard measure) can be performed by adapting efficient algorithms developed for k-anonymity. However, t-closeness does not subsume k-anonymity, and the authors suggest combining the two before releasing an anonymized table.
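The t-closeness condition itself is easy to sketch: compare the distribution of sensitive values within an anonymized group against the distribution over the whole table and test it against the threshold t. As noted above, [14] proposes its own distance metric; the sketch below uses total variation distance purely as a stand-in, and all names and data are hypothetical.

```python
from collections import Counter

def distribution(values):
    """Empirical distribution of sensitive values as a dict value -> probability."""
    counts = Counter(values)
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()}

def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)

def is_t_close(group_sensitive, table_sensitive, t):
    """Check the t-closeness condition for one equivalence class."""
    p = distribution(group_sensitive)   # distribution within the anonymized group
    q = distribution(table_sensitive)   # distribution over V_s(R), the whole table
    return total_variation(p, q) <= t

# Hypothetical example: one equivalence class vs. the whole sensitive column.
table = ["flu", "flu", "cancer", "flu", "cancer", "flu"]
group = ["flu", "cancer"]
print(is_t_close(group, table, t=0.2))  # True: distance ≈ 0.17 <= 0.2
```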
An Alternative Bayesian Modeling. [17] compares the notion of l-diversity to a model called Bayesian Optimal Privacy (BOP). Just like the GBP model, the BOP model is based on belief revision. However, the authors conclude that there is a mismatch between l-diversity and the BOP model. As demonstrated in this section, the reason is not a fundamental mismatch between Bayesian privacy models and l-diversity. Rather, it stems from the particular modeling choice in [17], which ignores assumption Util: [17] considers that a priori the attacker sees V_id(R) but not V_s(R). The difficulty with this modeling (identified in [17] as well) is that to estimate the attacker's a priori belief about S_r, we require knowledge of the attacker's probability distribution on the domain of all sensitive values, which is an unrealistic expectation. The modeling we describe in this section surmounts this obstacle: under assumption Util, it need not model this distribution; it only considers belief revision starting from the attacker's adjusted belief after seeing V_s(R). We can estimate this belief (as in (5)), regardless of the belief before seeing V_s(R).
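The belief-revision view taken here can be illustrated with a small sketch. The estimate in (5) is not reproduced; instead, empirical relative frequencies stand in for the attacker's adjusted belief after seeing V_s(R) and for the revised belief once A_g(R) narrows the candidates to one group. All names and data are hypothetical.

```python
from collections import Counter

def empirical_belief(values):
    """Empirical distribution over sensitive values, used as a stand-in belief."""
    counts = Counter(values)
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()}

# Adjusted belief: what the attacker holds after seeing V_s(R), the published
# multiset of sensitive values (no need to model beliefs prior to this point).
v_s = ["flu", "flu", "cancer", "flu", "hepatitis", "flu"]
adjusted_belief = empirical_belief(v_s)

# Revised belief about S_r after seeing A_g(R): the candidates shrink to the
# sensitive values appearing in the target record's anonymized group.
group_of_r = ["flu", "cancer"]
revised_belief = empirical_belief(group_of_r)

print(adjusted_belief)  # e.g. {'flu': 0.67, 'cancer': 0.17, 'hepatitis': 0.17}
print(revised_belief)   # e.g. {'flu': 0.5, 'cancer': 0.5}
```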