Database Reference
In-Depth Information
are involved in the operation of each of these two approaches and demonstrate why
the Max-Min 2 variant outbalances Max-Min 1 in the majority of hiding scenarios.
11.1 Hiding a Sensitive Itemset
Similarly to the border based algorithm of Sun & Yu [66, 67] (covered in Chapter
10), the Max-Min algorithms proposed by Moustakides & Verykios [50, 51] aim to
hide the sensitive itemsets in a way that minimizes the impact of the sanitization
process to the itemsets of the revised positive border. Both algorithms take as input
the itemsets of the revised positive border and the sensitive itemsets that participate
to the revised negative border. Assuming that a sensitive itemset I needs to be hidden
from the original database D O , the Max-Min algorithms have to determine the item
i 2 I that will be modified in selected transactions of the database to accommodate
the hiding of itemset I. Each item of a sensitive itemset is called a tentative victim
item, since it can be used for hiding this itemset. Among the tentative victim items,
the one that is selected by the hiding algorithms (here, for example, item i) is called
the victim item, since it will be deleted from transactions of the original database to
facilitate the hiding of the sensitive itemset.
The deletion of an item i2I from transactions of the original database, as already
discussed in Chapter 10, may affect itemsets that belong to the revised positive bor-
der, constituting some of them infrequent in the sanitized database. To limit this
side-effect, for every tentative victim item j 2 I the Max-Min algorithms compute
the set of itemsets Bd + j j of the revised positive border Bd + that contain this item.
The itemsets in Bd + j j are termed tentative victim itemsets since their status (fre-
quent vs. infrequent) may be affected by the deletion of item j. From each computed
set Bd + j j , j 2 I, the itemsets with the minimum support (henceforth referred to as
the minimum support itemsets) are selected, since they lie closest to the borderline
between the negative and the positive border and, thus, are the most vulnerable ones
from being accidentally lost. Among all minimum support itemsets coming from the
Bd + j j sets for the different items j 2 I, the Max-Min algorithms select the itemset
that has the maximum support. The rationale behind this criterion is that this item-
set, termed as the max-min itemset, has the maximum distance among all minimum
support itemsets from the borderline. The max-min itemset determines the tentative
victim item that will be deleted from transactions of the database to facilitate the
hiding of the sensitive itemset I. The goal of the Max-Min algorithms is to modify
the victim item that is indicated by the max-min itemset, in a way that minimally
affects the support of this itemset. When more than one max-min itemsets exist,
the corresponding set of itemsets, known as the max-min set, reflects the possible
victim items that can be used to hide the same sensitive itemset. At this point, the
two Max-Min algorithms operate in a different way to select the victim item from
the tentative ones and proceed to its sanitization.
Search WWH ::




Custom Search