Database Reference
In-Depth Information
remain infrequent in D. For these reasons, set C is chosen appropriately to consist
of all the itemsets of the revised border. The hybrid hiding algorithm is capable of
ensuring that if the inequalities (16.3) and (16.4) are satisfied for all the itemsets in
set C, then the produced solution is exact and is identical to the solution involving
the whole system of the 2 M 1 inequalities.
16.2.5 Handling Suboptimality
Since an exact solution may not always be feasible, the hybrid hiding algorithm
should be capable of identifying high quality approximate solutions. There are two
possible scenarios that may lead to non-existence of an exact solution. Under the
first scenario, D O itself does not allow for an optimal solution due to the various
supports of the participating itemsets. Under the second scenario, database D O is
capable of providing an exact solution but the size of the database extension is in-
sufficient to satisfy all the required inequalities of this solution.
To tackle the first case, the hybrid hiding algorithm follows the paradigm of the
inline approach [23] by assigning different degrees of importance to different in-
equalities. To be more precise, while it is crucial to ensure that (16.4) holds for all
sensitive itemsets in D, thus they are properly protected from disclosure, satisfac-
tion of (16.3) for an itemset rests in the discretion of ensuring the minimal possible
impact of the sanitization process toD O . This inherent difference in the significance
of the two inequalities, along with the fact that solving the system of all inequalities
of the form (16.4) always leads to a feasible solution (i.e., for any database D O ),
allows the relaxation of the problem when needed and the identification of a good
approximate solution.
To overcome the second issue, the hiding algorithm incorporates the use of a
safety margin threshold, which produces a further expansion of D X by a certain
number of transactions. These transactions must be added to the ones computed by
using (16.1). The introduction of a safety margin can be justified as follows. Since
equation (16.1) provides the lower bound on the size of database D X , it is possible
that the artificially created transactions are too few to accommodate for the proper
hiding of knowledge. This situation may occur due to conflicting constraints im-
posed by the various itemsets regarding their status in D. These constraints require
more transactions (or to be more precise, more item modifications) in order to be
met. Thus, a proper safety margin will allow the algorithm to identify an exact so-
lution if such a solution exists. Moreover, as is demonstrated in Section 16.3.4, the
additional extension of D X , due to the incorporation of the safety margin, can be
restricted to the necessary degree. A portion of the transactions in D X are selected
and removed at a later point, thus reducing its size and allowing an exact solution.
Therefore, the only side-effect of using a safety margin in the hiding process, is an
inflation in the number of constraints and associated binary variables in the problem
formulation, leading to a minuscule overhead in the runtime of the hiding algorithm.
Search WWH ::




Custom Search