Database Reference
In-Depth Information
Algorithm 14.1 Relaxation Procedure in V .
1: procedure S ELECT R EMOVE (Constraints C R , V, D)
2:
C R maxlen S argmax i fjR i jg
. C R i $V i
3:
cr msup min C R maxlen ;i (sup(R i 2V; D))
. R i 2 V
4:
for each c 2C R maxlen do
5:
if sup(R i ; D) = cr msup then
6:
Remove (c)
. remove constraint from the CSP.
7: end if
8: end for
9: end procedure
will always have a solution. This is due to the fact that all inequalities are of the same
type, so in the worst case scenario zeroing out all the involved u nm variables will
hide the sensitive knowledge. However, such a solution is certainly not desirable.
To identify a good hiding solution, the inline algorithm removes a small portion
of the constraints imposed by the itemsets in V (see Figure 14.2) so that the CSP
of the remaining inequalities becomes solvable. The question that is now raised is
how to select the inequalities to remove. Algorithm 14.1 provides a simple heuristic
for selection and removal of inequalities from the CSP of Figure 14.2. It applies a
relaxation process by removing all constraints that correspond to maximal size and
minimum support itemsets in V . The rationale behind this heuristic is that the hidden
itemsets in D (due to the side-effects of hiding itemsets in S max ) will be the first that
would be hidden in D O if the support threshold was increased, since their current
support is also low. After removing a subset of the constraints that participate to the
CSP, the remaining problem will become solvable. Thus, the relaxation process is
iteratively applied to the CSP until a solution is attained. Last, it is important to note
that each repetition of the relaxation process may potentially lead to the removal of
more than one constraints from the original CSP.
14.2.3 An Example
To demonstrate the core approach excluding the relaxation process, we borrow the
example that is used in [23]. Consider the transactional database that is presented in
Table 14.1. Using a minimum frequency threshold mfreq= 0:2, the set of frequent
itemsets that are mined from this database is: F D O =fA; B;C; D; AB; AC; AD;CD;
ACDg. Now suppose that one wishes to hide the sensitive itemset fABg, therefore
S =fABg. As a first step, the inline algorithm computes the ideal set of frequent
itemsets forD by using Algorithm 9.4. That is:F 0 D =fA; B;C; D; AC; AD;CD; ACDg.
The revised positive border for D will then be: Bd + (F 0 D ) =fB; ACDg. To produce
the constraints for the CSP, the inline algorithm has to consider all itemsets appear-
ing in C (see (14.5)). In this example, we have that: V =Bd + (F 0 D ) =fB; ACDg. The
next step is to substitute in all transactions supporting the sensitive itemsets, their
current T nm values of the sensitive items with u nm binary variables. This constitutes
 
Search WWH ::




Custom Search