In the sections that follow, we first examine how the proposed border-based
algorithm (BBA) weighs the itemsets of the revised positive border to quantify the
impact of an item deletion. Then, we present the process that BBA follows to hide
a sensitive itemset with minimal impact on the revised positive border. Last, we
present the ordering scheme that BBA uses to hide all the sensitive itemsets of
$S_{\min}$ in the sanitized database D that it produces.
10.2.1 Weighing Border Itemsets
As discussed earlier, the deletion of a hiding candidate may impact itemsets from
the revised positive border, possibly causing some of these itemsets to become lost in
the sanitized database D. To ameliorate this problem, BBA employs a weighting
scheme that allows it to select, at each point, the item deletion that causes minimal
impact on the itemsets of the revised positive border. To achieve this, each itemset
of the revised positive border is weighted based on its vulnerability to being affected
by an applied item deletion. In the proposed weighting scheme, larger weights are
assigned to border itemsets that are more vulnerable and, thus, should have a lower
priority of being affected. The authors of [66, 67] define the weight of a border
itemset in the following way:
\[
w(I \in Bd^{+}) =
\begin{cases}
\dfrac{\mathrm{sup}(I, D_O) - \mathrm{sup}(I, D') + 1}{\mathrm{sup}(I, D_O) - msup}, & \mathrm{sup}(I, D') \geq msup + 1 \\[1ex]
l + msup - \mathrm{sup}(I, D'), & 0 \leq \mathrm{sup}(I, D') \leq msup
\end{cases}
\]
where $D'$ is the database during the sanitization process, $\mathrm{sup}(I, D')$ is the current
support of itemset $I$ in the database, and $l$ is an integer with a value greater than the
number of itemsets that participate in the revised positive border. Although we will
not discuss the properties of the proposed weighting function in detail (see footnote 2),
we briefly note that this function is designed to (i) encourage item deletions that
minimally affect frequent itemsets of the revised positive border, (ii) prevent item
deletions that would cause frequent itemsets of the revised positive border to be lost,
and (iii) prevent item deletions that would cause a further loss in the support of
already lost itemsets from the revised positive border (thus trying to keep the lost
border itemsets near the revised borderline). Moreover, the rate at which the weights
of the itemsets of the revised positive border increase is designed to help maintain
the relative support of the itemsets in the sanitized database.
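
To make the two branches of the weighting function concrete, the following is a minimal Python sketch of it, not the implementation of [66, 67]. The function name border_weight, its parameter names, and the numeric values in the example (original support 10, msup = 4, l = 100) are assumptions chosen for illustration only. Exact fractions are used so that the weights can be compared without rounding.

```python
from fractions import Fraction


def border_weight(sup_orig, sup_cur, msup, l):
    """Weight of one itemset of the revised positive border.

    sup_orig -- sup(I, D_O), support in the original database
    sup_cur  -- sup(I, D'), support in the database during sanitization
    msup     -- minimum support threshold (absolute count)
    l        -- integer larger than the number of border itemsets
    (All names are illustrative, not taken from the source.)
    """
    if not 0 <= sup_cur <= sup_orig:
        raise ValueError("expected 0 <= sup_cur <= sup_orig")
    if sup_cur >= msup + 1:
        # First branch: the weight grows from 1/(sup_orig - msup) up to 1
        # as the current support drops towards msup + 1.
        # sup_orig > msup holds here, so the denominator is positive.
        return Fraction(sup_orig - sup_cur + 1, sup_orig - msup)
    # Second branch (0 <= sup_cur <= msup): the weight is at least l and
    # keeps growing by 1 for every further unit of support that is lost.
    return Fraction(l + msup - sup_cur)


if __name__ == "__main__":
    # Hypothetical itemset: original support 10, msup = 4, l = 100.
    for sup_cur in (10, 7, 5, 4, 2):
        print(sup_cur, border_weight(10, sup_cur, msup=4, l=100))
    # weights: 1/6, 2/3, 1, 100, 102
```

In this example the weight never exceeds 1 while the current support is at least msup + 1, and it jumps to at least l once the support reaches msup. Because l is larger than the number of border itemsets, a single itemset in the second branch outweighs all itemsets in the first branch combined, which reflects properties (ii) and (iii) above.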
modifications as an atomic operation. This means that all possible item modifications are applied
at the same time and thus there is no need to recompute the border.
2 The reader is encouraged to refer to [67] for a thorough discussion of the properties and the
rationale behind the employed weighting scheme.
 