Once a data block is released, it is no longer possible to go back and increase
the level of generalization. On the other hand, new releases may sharpen an
attacker's view of the data and may make the overall data set more susceptible
to attack. For example, when different views of the data are released
sequentially, a join on the two releases [109] may be used to sharpen the
ability to distinguish particular records in the data. A technique discussed in
[109] relies on lossy joins in order to cripple an attack based on global
quasi-identifiers. The intuition behind this approach is that if the join is
lossy enough, it reduces the confidence of the attacker in relating records
from previous views to the current release. Thus, the inability to link
successive releases is key to preventing further discovery of the identity of
records.
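To make the linking risk concrete, the following is a minimal sketch of how an attacker might join two sequentially released views on shared attributes to obtain sharper records, and how a lossier join dilutes that inference. The table contents, column names, and the use of pandas are illustrative assumptions and are not the construction of [109].

import pandas as pd

# Hypothetical first release: generalized quasi-identifiers plus a sensitive column.
release_1 = pd.DataFrame({
    "age_range":  ["20-29", "20-29", "30-39"],
    "zip_prefix": ["481**", "481**", "482**"],
    "disease":    ["flu", "cold", "cancer"],
})

# Hypothetical second release of the same population under a different view,
# sharing the age_range column with the first release.
release_2 = pd.DataFrame({
    "age_range": ["20-29", "30-39", "30-39"],
    "gender":    ["M", "F", "F"],
    "disease":   ["flu", "cancer", "cold"],
})

# Joining on the shared attributes sharpens the attacker's view: each surviving
# row links zip_prefix and gender for the same underlying individual.
sharp = release_1.merge(release_2, on=["age_range", "disease"])
print(sharp)

# A lossier join (coarser join key) admits spurious pairings between the two
# releases, lowering the attacker's confidence in any single linkage.
lossy = release_1.merge(release_2, on=["age_range"])
print(lossy)

In this toy example the exact join pins each matched pair of rows down uniquely, while the coarser join produces additional spurious pairings; amplifying this ambiguity is precisely the effect the lossy-join defense aims for.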
3.4 The l-diversity Method
The k-anonymity model is attractive because of the simplicity of its definition
and the numerous algorithms available to perform the anonymization.
Nevertheless, the technique is susceptible to several kinds of attack,
especially when background knowledge is available to the attacker. Some such
attacks are as follows:
Homogeneity Attack: In this attack, all the values for a sensitive attribute
within a group of k records are the same. Therefore, even though the data is
k-anonymized, the value of the sensitive attribute for that group of k records
can be predicted exactly (a small sketch following these two examples
illustrates this).
Background Knowledge Attack: In this attack, the adversary can use an
association between one or more quasi-identifier attributes and the sensitive
attribute in order to further narrow down the possible values of the sensitive
field. An example given in [77] is one in which background knowledge of the low
incidence of heart attacks among the Japanese could be used to narrow down the
possible values of the sensitive field recording which disease a patient
might have.
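The following small sketch illustrates both attacks on an invented 2-anonymized table; the attribute values and the background-knowledge rule are assumptions made purely for illustration.

# Hypothetical 2-anonymized records: (age_range, zip_prefix) are the generalized
# quasi-identifiers and "disease" is the sensitive attribute.
records = [
    ("20-29", "130**", "heart disease"),
    ("20-29", "130**", "heart disease"),   # homogeneous q-block
    ("30-39", "148**", "heart disease"),
    ("30-39", "148**", "viral infection"),
]

# Group the records into q-blocks by their quasi-identifier values.
blocks = {}
for age_range, zip_prefix, disease in records:
    blocks.setdefault((age_range, zip_prefix), []).append(disease)

# Homogeneity attack: a q-block with a single distinct sensitive value
# discloses that value exactly, despite the 2-anonymity.
for qid, diseases in blocks.items():
    if len(set(diseases)) == 1:
        print(f"q-block {qid}: sensitive value disclosed as {diseases[0]!r}")

# Background-knowledge attack: if the attacker knows the target in the second
# q-block is very unlikely to have heart disease, that block effectively
# collapses to a single remaining value.
remaining = [d for d in blocks[("30-39", "148**")] if d != "heart disease"]
print(f"values still plausible for the target: {remaining}")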
Clearly, while k-anonymity is effective in preventing identification of a
record, it may not always be effective in preventing inference of the sensitive
values of the attributes of that record. Therefore, the technique of
l-diversity was proposed, which not only maintains the minimum group size of k,
but also focuses on maintaining the diversity of the sensitive attributes. The
l-diversity model [77] for privacy is defined as follows:
Definition 2. Let a q-block be a set of tuples such that its non-sensitive
values generalize to q. A q-block is l-diverse if it contains l "well
represented" values for the sensitive attribute S. A table is l-diverse if
every q-block in it is l-diverse.
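As a minimal illustration, the sketch below checks the simplest reading of this definition, in which "well represented" is taken to mean at least l distinct sensitive values per q-block (distinct l-diversity); the function name and the example table are assumptions for illustration and not code from [77].

def is_l_diverse(table, sensitive_index, l):
    """Check distinct l-diversity: every q-block must contain at least
    l distinct values of the sensitive attribute.

    table is a list of tuples whose columns other than sensitive_index
    hold the (already generalized) quasi-identifier values.
    """
    blocks = {}
    for row in table:
        qid = tuple(v for i, v in enumerate(row) if i != sensitive_index)
        blocks.setdefault(qid, set()).add(row[sensitive_index])
    return all(len(values) >= l for values in blocks.values())

# The table from the previous sketch is 2-anonymous but not 2-diverse, because
# its first q-block contains only a single sensitive value.
table = [
    ("20-29", "130**", "heart disease"),
    ("20-29", "130**", "heart disease"),
    ("30-39", "148**", "heart disease"),
    ("30-39", "148**", "viral infection"),
]
print(is_l_diverse(table, sensitive_index=2, l=2))   # prints False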
A number of different instantiations for the l -diversity definition are discussed
in [77]. We note that when there are multiple sensitive attributes, then the