Algorithm 13.2 The Intelligent Sanitization Approach of [47].
function INTELLIGENT-SANITIZE(Database D_S, sensitive itemsets S)
    for each transaction T_i ∈ D_S do
        identify all itemsets S_j ∈ S that it supports
        while S_j ≠ ∅ do
            remove the item of T_i that appears most often in S_j
            remove the itemsets of S_j that contain this item
        end while
    end for
    Return: sanitized transactions T_i
end function
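The listing above can be sketched in Python as follows. This is a minimal illustration rather than the implementation of [47]: the function name, the representation of transactions and itemsets as Python sets, and the tie-breaking rule (smallest item first, which requires mutually comparable items) are all assumptions made for the example.

```python
from collections import Counter


def intelligent_sanitize(marked_transactions, sensitive_itemsets):
    """Sketch of Algorithm 13.2 on transactions given as sets of items.

    marked_transactions plays the role of D_S; both argument names are
    hypothetical and chosen only for this example.
    """
    sanitized = []
    for transaction in marked_transactions:
        items = set(transaction)
        # S_j: the non-empty sensitive itemsets this transaction supports.
        supported = [set(s) for s in sensitive_itemsets if s and set(s) <= items]
        while supported:
            # Count how often each item occurs across the remaining itemsets.
            counts = Counter(item for s in supported for item in s)
            # Pick the most frequent item; ties go to the smallest item
            # (an assumption, the listing does not specify tie-breaking).
            victim = min(counts, key=lambda item: (-counts[item], item))
            items.discard(victim)
            # Itemsets containing the removed item are no longer supported.
            supported = [s for s in supported if victim not in s]
        sanitized.append(items)
    return sanitized


if __name__ == "__main__":
    marked = [{"a", "b", "c", "d"}, {"a", "d"}]
    sensitive = [{"a", "b"}, {"b", "c"}]
    print(intelligent_sanitize(marked, sensitive))
```

In the first example transaction, item b occurs in both supported itemsets, so removing it alone is enough and the remaining items are kept.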
responding to sensitive itemsets I_1, I_3) and {c_2, c_4, c_5} (corresponding to sensitive
itemsets I_2, I_4, I_5).
A final remark about the proposed decomposition strategy is that it has no
impact on the quality of the solution identified for the original CSP.
However, there may be hiding scenarios in which the corresponding
constraints-by-transactions matrix cannot be decomposed. For such cases, the
authors of [47] discuss some quick procedures that can be employed to tackle the
problem, albeit at a cost to the quality of the attained solution. Moreover, in
Chapter 17 we discuss a framework that can be employed for the decomposition and
parallelization of CSPs that are produced by exact hiding algorithms.
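To make the notion of a decomposable constraints-by-transactions matrix concrete, the sketch below groups constraints that share at least one transaction, so that each group could be solved as a separate subproblem. It assumes the matrix is a 0/1 table with one row per constraint and one column per transaction, and it uses a standard union-find over rows; whether this matches the exact procedure of [47] is an assumption, and the function name and the example matrix are made up for illustration.

```python
def independent_blocks(matrix):
    """Group row indices (constraints) connected through shared columns.

    matrix is a list of 0/1 rows: matrix[r][c] is 1 when constraint r
    involves transaction c. Returns a sorted list of row-index groups.
    """
    parent = list(range(len(matrix)))

    def find(i):
        # Find the representative of i, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_row_for_column = {}
    for r, row in enumerate(matrix):
        for c, entry in enumerate(row):
            if not entry:
                continue
            if c in first_row_for_column:
                # Rows sharing a column belong to the same block.
                parent[find(r)] = find(first_row_for_column[c])
            else:
                first_row_for_column[c] = r

    blocks = {}
    for r in range(len(matrix)):
        blocks.setdefault(find(r), []).append(r)
    return sorted(blocks.values())


if __name__ == "__main__":
    example = [
        [1, 1, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 1, 0, 0, 0],
        [0, 0, 1, 1, 0],
        [0, 0, 0, 1, 1],
    ]
    print(independent_blocks(example))
```

A result with a single group means that, under this notion of independence, the matrix cannot be split and the CSP has to be handled as a whole.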
13.2 Heuristic Part
The solution of the CSP that was formulated in the exact part of Menon's approach
yields a set of transactions from D_O that have to be sanitized in order to conceal
the sensitive knowledge. In this section, we describe the actual process that is
followed in [47] for the sanitization of these transactions. The authors present two
simple heuristic strategies that take as input the transactions that were marked for
sanitization based on the solution of the CSP, and output a database D in which the
sensitive knowledge S from D_O is properly hidden.
The first strategy, known as the blanket approach, is inspired by the work of
Oliveira & Zaïane in [54] and operates by deleting all items except one
from each transaction that was previously marked for sanitization. Although this
strategy succeeds in hiding all the sensitive knowledge from D_O, it generally
introduces significant and unnecessary side-effects to D.
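The blanket approach can be sketched in a few lines of Python. The text does not say which item survives in each transaction, so keeping the smallest item in sort order is an assumption made only for this example, as are the function and parameter names.

```python
def blanket_sanitize(marked_transactions):
    """Keep a single item in every transaction marked for sanitization.

    Which item is kept is not specified in the text; this sketch keeps
    the smallest one in sort order (an assumption).
    """
    sanitized = []
    for transaction in marked_transactions:
        # sorted(...)[:1] is empty for an empty transaction, so no error.
        sanitized.append(set(sorted(transaction)[:1]))
    return sanitized


if __name__ == "__main__":
    print(blanket_sanitize([{"a", "b", "c"}, {"x"}]))
```

A transaction reduced to one item cannot support any itemset of two or more items, which is why the strategy hides such itemsets, and also why it discards far more items than necessary.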
The second strategy, called the intelligent approach, does significantly less
harm to the original database by retaining the majority of the items that
appear in each transaction marked for sanitization. Specifically, the algorithm
operates as follows. First, for each transaction T_i ∈ D_S, it finds
the set of sensitive itemsets S_j from S that this transaction supports. To sanitize the
transaction, the intelligent approach chooses to remove the item of this transaction
 