Database Reference
In-Depth Information
minimize
u qm
å q2[1;Q+ SM ];m2[1;M]
8
<
:
å Q+ SM
q=1 Õ i m 2I u qm < thr;8I 2Bd (F 0 D )
subject to
å Q+ SM
q=1 Õ i m 2I u qm thr;8I 2Bd + (F 0 D )
where
thr = mfreq(N +Q+SM)sup(I;D O )
Fig. 16.1: CSP formulation as an optimization process.
Based on (16.7) the itemsets of the revised borders Bd + (F 0 D ) and Bd (F 0 D ) can
be used to produce the inequalities, which will lead to an exact hiding solution.
16.3.2 Adjusting the Size of the Extension
Equation (16.1) provides the absolute minimum number of transactions that need
to be added in D X , to allow for the proper hiding of the sensitive itemsets of D O .
However, this lower bound can, under certain circumstances, be insufficient to allow
for the identification of an exact solution 1 , even if one exists [26]. To circumvent this
problem, one needs to expand the size Q ofD X as determined by (16.1), by a certain
number of transactions. A threshold, called Safety Margin (SM) is incorporated for
this purpose. Safety margins can be either predefined or be computed dynamically,
based on particular properties of database D O and/or other parameters regarding the
hiding process. In any case, the target of using a safety margin is to ensure that an
adequate number of transactions participate in D X , thus an exact solution (if one
exists) will not be lost due to the small size of the produced extension.
Since for each transaction in D X , M new binary variables are introduced that
need to be tuned when solving the system of inequalities from C, one would ide-
ally want to identify a sufficiently large number of transactions for D X (that allow
for an exact solution), while this number be as low as possible to avoid unneces-
sary variables and constraints participating in the hiding process. Supposing that the
value of Q is adjusted based on (16.1) and a sufficiently large safety margin is used,
the methodology of Section 16.3.4 minimizes the size of D X after the sanitization
process to allow for an ideal solution.
1 On the contrary, the lower bound is always sufficient to allow for an approximate solution of the
set of inequalities produced by the itemsets in C.
 
Search WWH ::




Custom Search