Pontikakis et al. [58] argue that the main disadvantage of blocking is the fact
that the dataset, apart from the blocked values (i.e., the incorporated unknowns), is
not distorted. Thus, an adversary can disclose the hidden association rules simply by
identifying those generating itemsets that contain question marks and lead to rules
with a maximum confidence that lies above the minimum confidence threshold. If
the number of these rules is small, then the probability of identifying the sensitive
ones among them becomes high. To avoid this serious shortcoming of previous
approaches, the authors propose a blocking algorithm that purposely generates rules
that did not exist in the original dataset (i.e., ghost rules) and whose generating
itemsets contain unknowns. Thus, the identification of the sensitive association
rules becomes harder, since the adversary is unable to tell which of the rules that
have a maximum confidence above the minimum threshold are the sensitive ones and
which are the ghost ones. However, the introduction of ghost rules leads to a
decrease in the data quality of the sanitized database. In order to balance the trade-off
between the level of privacy and data utility, the proposed algorithm incorporates a
safety margin that corresponds to the extent of sanitization that can be performed on
the database. The higher the safety margin, the better the protection of the sensitive
association rules and the worse the data utility of the resulting sanitized database.
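
The adversary's filtering step described above can be illustrated with a minimal Python sketch. It is not the algorithm of [58]; it only shows how rules whose generating itemsets contain question marks and whose maximum confidence reaches the threshold could be singled out. The names `support_bounds`, `max_confidence` and `candidate_rules`, the toy database, and the optimistic bound used for the maximum confidence (maximum support of the generating itemset divided by the minimum support of the antecedent, capped at 1) are all assumptions made for illustration.

```python
UNKNOWN = "?"  # marker for a blocked value (assumed encoding)

def support_bounds(db, itemset):
    """Return (min_count, max_count) of transactions supporting itemset.

    min_count: every item is a certain 1.
    max_count: every item is 1 or unknown (unknowns read optimistically as 1).
    """
    min_count = max_count = 0
    for row in db:
        values = [row[item] for item in itemset]
        if all(v == 1 for v in values):
            min_count += 1
            max_count += 1
        elif all(v in (1, UNKNOWN) for v in values):
            max_count += 1
    return min_count, max_count

def max_confidence(db, antecedent, consequent):
    """Optimistic confidence bound for antecedent -> consequent (assumed formula)."""
    _, max_both = support_bounds(db, antecedent + consequent)
    min_ante, _ = support_bounds(db, antecedent)
    if min_ante == 0:
        return 1.0 if max_both > 0 else 0.0
    return min(1.0, max_both / min_ante)

def has_unknowns(db, itemset):
    """True if any transaction holds an unknown for an item of the itemset."""
    return any(row[item] == UNKNOWN for row in db for item in itemset)

def candidate_rules(db, rules, min_conf):
    """Rules an adversary would suspect: unknowns present, max confidence >= threshold."""
    return [
        (ante, cons)
        for ante, cons in rules
        if has_unknowns(db, ante + cons)
        and max_confidence(db, ante, cons) >= min_conf
    ]

# Toy sanitized database: 1 = present, 0 = absent, "?" = blocked value.
db = [
    {"a": 1, "b": 1, "c": 1},
    {"a": 1, "b": UNKNOWN, "c": 0},
    {"a": 1, "b": 0, "c": 1},
    {"a": 0, "b": 1, "c": 1},
    {"a": 1, "b": UNKNOWN, "c": 1},
]
rules = [(("a",), ("b",)), (("a",), ("c",)), (("c",), ("b",))]
suspects = candidate_rules(db, rules, min_conf=0.7)
```

In this toy database only the rules a -> b and c -> b end up in `suspects`; the rule a -> c is ignored because its generating itemset contains no unknowns. The fewer such candidates there are, the easier it is to guess which ones are sensitive, which is the weakness that ghost rules are meant to mask.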