operates by identifying the set of attributes that influence the existence of each sen-
sitive rule the most and then removing them from those supporting transactions that
affect the nonsensitive rules the least.
Chen & Liu [16] present a random rotation perturbation technique to preserve
the multidimensional geometric characteristics of the original database with respect
to task-specific information. As a result, the sensitive knowledge in the sanitized
database is adequately protected against disclosure, while the utility of the data is
preserved to a large extent.
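The geometric idea behind rotation perturbation can be illustrated with a minimal sketch. It is not the algorithm of [16]: it only assumes that every record is a numeric vector and that the same random orthogonal matrix is applied to all records. The helper names (random_orthogonal, perturb, distance) and the toy records are hypothetical. The point it shows is that pairwise Euclidean distances, on which many classifiers and clustering methods rely, are unchanged while the attribute values themselves are not.

```python
import math
import random


def random_orthogonal(dim, rng):
    """Build a random orthogonal matrix by Gram-Schmidt on Gaussian vectors."""
    basis = []
    while len(basis) < dim:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Remove the components along the vectors already in the basis.
        for b in basis:
            dot = sum(x * y for x, y in zip(v, b))
            v = [x - dot * y for x, y in zip(v, b)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-9:  # skip (very unlikely) degenerate draws
            basis.append([x / norm for x in v])
    # Rows are orthonormal. The matrix may include a reflection;
    # distances are preserved either way.
    return basis


def perturb(records, matrix):
    """Apply the same orthogonal map to every record."""
    return [[sum(m * x for m, x in zip(row, rec)) for row in matrix]
            for rec in records]


def distance(a, b):
    """Euclidean distance between two records."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical toy records with three numeric attributes.
rng = random.Random(7)
original = [[1.0, 2.0, 3.0], [4.0, 0.0, -1.0], [2.5, 2.5, 0.0]]
rotation = random_orthogonal(3, rng)
sanitized = perturb(original, rotation)
```

Comparing distance(original[i], original[j]) with distance(sanitized[i], sanitized[j]) for any pair of records should give the same value up to floating-point error.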
Reconstruction-based approaches, inspired by the work of [17, 61] and intro-
duced by Natwichai, et al. [52], offer an alternative to suppression-based techniques.
These approaches aim to reconstruct the original database by using only the
supporting transactions of the nonsensitive rules. As discussed in [71],
reconstruction-based approaches are advantageous when compared to heuristic data
modification algorithms, since they introduce hardly any side effects into the
hiding process.
They operate as follows. First, they perform rule-based classification on the original
database to enable the data owner to identify the sensitive rules. Then, they construct
a decision tree classifier that contains only the nonsensitive rules approved by the
data owner. The produced database remains similar to the original one, except for the
sensitive part, and the difference between the two databases is proven to decrease as
the number of rules increases.
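The core idea, releasing only what the approved rules support, can be sketched as follows. This is a deliberate simplification and not the method of [52]: instead of building a decision tree from the nonsensitive rules and generating the released database from it, the sketch simply filters the original records. It assumes that each rule is a pair of attribute conditions and a class label, and that rules are mutually exclusive, as the paths of a decision tree would be. The names supports and reconstruct, and the toy data, are hypothetical.

```python
def supports(record, rule):
    """A record supports a rule if it meets every condition and has the rule's class."""
    conditions, label = rule
    return (all(record.get(attr) == value for attr, value in conditions.items())
            and record["class"] == label)


def reconstruct(database, rules, sensitive):
    """Keep only records that support at least one nonsensitive rule."""
    approved = [rule for i, rule in enumerate(rules) if i not in sensitive]
    return [rec for rec in database
            if any(supports(rec, rule) for rule in approved)]


# Hypothetical toy data: attribute names and values are illustrative only.
database = [
    {"age": "young", "income": "high", "class": "approve"},
    {"age": "young", "income": "low", "class": "reject"},
    {"age": "old", "income": "high", "class": "approve"},
    {"age": "old", "income": "low", "class": "reject"},
]
rules = [
    ({"income": "high"}, "approve"),
    ({"age": "young", "income": "low"}, "reject"),  # marked sensitive below
    ({"age": "old", "income": "low"}, "reject"),
]
released = reconstruct(database, rules, sensitive={1})
```

In this toy case the released database keeps the three records that support the two approved rules and drops the single record that supports only the sensitive rule.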
Natwichai, et al. [53] propose a methodology that further improves the quality
of the reconstructed database. This is accomplished by extracting additional charac-
teristic information from the original database with regard to the classification and
by improving the decision tree building process. Furthermore, with the aid of
information gain, the usability of the released database is improved even when
hiding many sensitive rules with high discernibility in record classification.
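Information gain is used above without a definition. In its standard form it is the reduction in class entropy obtained by splitting the records on an attribute. The sketch below computes it for a toy data set. How exactly [53] employs the measure during tree building is not described here, so the code only illustrates the measure itself. The function names, the "class" key, and the records are assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(records, attribute, target="class"):
    """Entropy reduction obtained by splitting the records on `attribute`."""
    labels = [rec[target] for rec in records]
    groups = {}
    for rec in records:
        groups.setdefault(rec[attribute], []).append(rec[target])
    # Weighted average entropy of the groups produced by the split.
    remainder = sum(len(g) / len(records) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy records.
records = [
    {"age": "young", "income": "high", "class": "approve"},
    {"age": "young", "income": "low", "class": "reject"},
    {"age": "old", "income": "high", "class": "approve"},
    {"age": "old", "income": "low", "class": "reject"},
]
gain_income = information_gain(records, "income")
gain_age = information_gain(records, "age")
```

Here the split on income separates the two classes perfectly, so its gain equals the full class entropy of one bit, whereas the split on age leaves the class mix unchanged and yields no gain.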
An approach similar to that of [53] was proposed by Katsarou, et al. [40]. The
proposed methodology operates by modifying transactions supporting both sensi-
tive and nonsensitive classification rules in the original database and then using the
supporting transactions of the nonsensitive rules to produce its sanitized counterpart.
4.2 Privacy Preserving Clustering
The area of privacy preserving clustering comprises methodologies that aim to protect
the underlying attribute values, and thus ensure the privacy of the individuals who
are recorded in the data, when the data is shared for clustering purposes. Achieving
privacy preservation when sharing data for clustering is a challenging task, since the
privacy requirements must be met while the clustering results remain valid. The
methodologies that have been proposed so far can be separated into two broad
categories: transformation-based approaches and protocol-based approaches.
Transformation-based approaches are directly related to the distortion-based ap-
proaches of association rule hiding. They operate by performing a data transforma-