Database Reference
In-Depth Information
16.3 Hybrid Solution Methodology
In the following sections, we present the way that [26] achieves to minimize the
problem size by regulating the status of only an essential portion of itemsets from
P. Moreover, we present the solution that was proposed regarding the size of the
extension D X and the formulation of the hiding process as a CSP solved by using
BIP. Finally, the critical issues of how to ensure validity of transactions in D and
how to handle suboptimality in the hiding process, are appropriately addressed.
16.3.1 Problem Size Minimization
Cover relations governing the itemsets in the lattice of D O ensure that the formu-
lated set of itemsets C has an identical solution to the one of solving the system of
all 2 M 1 inequalities for D. Since the status of each itemset in C needs to be reg-
ulated in D X , a one-to-one correspondence exists between itemsets and produced
inequalities in the system.
Definition 16.4. (Solution of a system of inequalities from C) Given set C =
fC 1 ;C 2 ; : : : ; C 2 M 1 g, let L C be the set of solutions of the corresponding inequalities
from C. The set of solutions for the system of inequalities will be the intersection of
the solutions produced by each inequality. Thus, L C = T r=1 L C r .
Based on this notation, the set of solutions that corresponds to all itemsets in D
is denoted as L Pnf?g . Given an inequality, for instance C 2 , produced by an itemset
I with a solution set being a proper subset of the solution set of another inequality,
say C 1 , one can deduce that the latter inequality can be removed from any system
containing the first inequality, without affecting the global solution of this system.
In this case L C 2 L C 1 and C 1 is a (generalized) cover of C 2 .
The following theorems identify itemsets that should be represented in the CSP
formulation and prove that the optimal hiding solution L C in the case of the hybrid
algorithm can be found based on the union of the borders Bd + (F 0 D ) and Bd (F 0 D ).
Theorem 16.2. If I 2F 0 D then 8J I we have that L I L J .
Proof. Consider the inequality produced by I, which should be frequent in D, i.e.
Q
q=1
i m 2I u qm msupsup(I;D O )
(16.5)
Suppose that this inequality holds for a combination of u qm values, corresponding
to a solution l 2L I . If all the u qm variables in the extension D X are substituted
with their values from l, then every bit will gain a specific value. Let D I denote the
supporting transactions of itemset I in database D and index p 2 [1; P] be used to
Search WWH ::




Custom Search