10.2.2 Hiding a Sensitive Itemset
BBA hides a sensitive itemset by applying a series of item deletions, each dictated
by a hiding candidate and each reducing the support of the sensitive itemset by one.
Under the weighting scheme that was discussed earlier, each itemset of the revised
positive border carries a weight that denotes its vulnerability to being affected.
Using these weights, the algorithm selects the hiding candidate that has minimal
effect on the itemsets of the revised positive border and enforces the corresponding
item deletion. It is important to observe that the hiding of a sensitive itemset may
affect only those itemsets of the revised positive border with which it shares at least
one common item. This is because the hiding of a sensitive itemset corresponds to a
reduction in the support of some of the items it contains, while the support of the
rest of the items (in the universe of all possible items from I) will remain
unaffected. The authors of [66, 67] define the set $Bd^+|_I$ of possibly affected
itemsets J from the revised positive border $Bd^+$ (due to the hiding of an itemset I)
as the affected border, where
$Bd^+|_I = \{J \mid J \in Bd^+ \wedge I \cap J \neq \emptyset\}$.³
Given a hiding candidate $c = (T_o, i_o)$ for a sensitive itemset I, the border based
approach calculates the impact of deleting c as the sum of the weights of the itemsets
in the revised positive border that will be affected by this item's deletion. Clearly,
these itemsets are drawn from the subset of $Bd^+|_I$ whose members contain item $i_o$.
By applying this strategy, the BBA algorithm computes in each iteration the impact of
each hiding candidate and deletes the one bearing the minimum impact.
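The selection step described above can be sketched in Python. This is a minimal illustration rather than the authors' implementation: the function names, the toy transactions, and the weight values are all hypothetical, and the border weights are taken as given instead of being derived from the weighting scheme. The sketch also assumes that deleting $i_o$ from $T_o$ affects a border itemset only when that itemset contains $i_o$ and is supported by $T_o$.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Itemset = FrozenSet[str]
Candidate = Tuple[int, str]  # (index of transaction T_o, item i_o)


def affected_border(border: Set[Itemset], sensitive: Itemset) -> Set[Itemset]:
    """Bd+|_I: border itemsets that share at least one item with I."""
    return {J for J in border if J & sensitive}


def hiding_candidates(transactions: List[Itemset],
                      sensitive: Itemset) -> List[Candidate]:
    """All pairs (T_o, i_o) where T_o supports I and i_o is an item of I."""
    return [(idx, item)
            for idx, t in enumerate(transactions) if sensitive <= t
            for item in sorted(sensitive)]


def impact(candidate: Candidate, transactions: List[Itemset],
           weights: Dict[Itemset, float], sensitive: Itemset) -> float:
    """Sum of the weights of affected-border itemsets hit by the deletion.

    Assumption: an itemset J is hit only if it contains i_o and T_o supports J.
    """
    t_idx, item = candidate
    t = transactions[t_idx]
    return sum((weights[J] for J in affected_border(set(weights), sensitive)
                if item in J and J <= t), 0.0)


def select_candidate(transactions: List[Itemset],
                     weights: Dict[Itemset, float],
                     sensitive: Itemset) -> Candidate:
    """Pick the hiding candidate with minimum impact (first one on ties)."""
    candidates = hiding_candidates(transactions, sensitive)
    return min(candidates,
               key=lambda c: impact(c, transactions, weights, sensitive))


# Hypothetical toy data: four transactions, three weighted border itemsets.
transactions = [frozenset("abc"), frozenset("abd"),
                frozenset("ab"), frozenset("cd")]
weights = {frozenset("ac"): 2.0, frozenset("bd"): 1.0, frozenset("cd"): 5.0}
sensitive = frozenset("ab")

best = select_candidate(transactions, weights, sensitive)
```

In this toy setting the affected border excludes the itemset {c, d}, since it shares no item with the sensitive itemset {a, b}. A full hiding loop would apply the chosen deletion and then recompute the weights before the next iteration, which the sketch leaves out.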
10.2.3 Order of Hiding Itemsets
The order in which the sensitive itemsets from $S_{min}$ are hidden by the border based
algorithm plays an important role in the quality of the constructed sanitized database
D.⁴ This is because the affected borders of two or more sensitive itemsets may contain
common itemsets of the revised positive border. Assuming two overlapping affected
borders $Bd^+|_I$ and $Bd^+|_J$ (for two sensitive itemsets $I, J \in S_{min}$),
enforcing the hiding candidates for I may change the weight of some itemsets in
$Bd^+|_J$, thus affecting subsequent decisions that are taken for the hiding of J. As a
result, the authors propose that the itemsets in $S_{min}$ be hidden in decreasing order
of length, breaking ties in increasing order of support.
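The proposed ordering amounts to a two-key sort, sketched below. The function name and the support counts are hypothetical; the sketch assumes the support of every sensitive itemset is already known.

```python
from typing import Dict, FrozenSet, Iterable, List

Itemset = FrozenSet[str]


def hiding_order(sensitive_itemsets: Iterable[Itemset],
                 support: Dict[Itemset, int]) -> List[Itemset]:
    """Sort by decreasing length, then by increasing support on ties."""
    return sorted(sensitive_itemsets, key=lambda s: (-len(s), support[s]))


# Hypothetical supports for three sensitive itemsets.
support = {frozenset("ab"): 5, frozenset("cde"): 3, frozenset("ac"): 4}
order = hiding_order(support.keys(), support)
```

Here the three-item itemset is hidden first, and the two itemsets of length two follow with the less frequent one ahead of the other.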
3 It is interesting to contrast this formula to (14.5), which is used by the inline algorithm to select
itemsets whose status (frequent vs. infrequent) needs to be controlled by the hiding algorithm.
4 Sun & Yu [66, 67] evaluate the quality of the sanitized database in terms of the
nonsensitive itemsets that are lost in the hiding process. Gkoulalas-Divanis & Verykios
[23] employ an alternative measure: their algorithm uses the number of item deletions
that were necessary for the hiding of the sensitive itemsets.