Database Reference
In-Depth Information
the oscillation of a pendulum, where one stage follows the other with the hope that
the computed revised positive border will converge at some point to the ideal one.
We now proceed to analyze the mechanism that takes place as part of the second
phase of this “oscillation”. Since the goal of this phase is to constitute the itemsets
of H frequent in D H , the modification of this database will be based only on item
inclusions on a selected subset of transactions. Following the inline approach [23],
the candidate items for inclusion are only those that appear among the itemsets inH,
i.e. those in the universe ofI H . All these items will be substituted in the transactions
of D H where they are unsupported, by the corresponding u nm variables. This will
produce an intermediate form of database D H . Following that, a CSP is constructed
in which the itemsets that are controlled belong in set
C =H[Bd (F 0 D )
(15.1)
Finally, the optimization criterion is altered to denote that a minimization (i.e. the
least amount of 1s), rather than a maximization of the binary variables u nm will lead
to the best possible solution. Figure 15.2 presents the form of the CSP created in
the second phase of this two-phase iterative hiding process. As mentioned earlier,
these two phases are executed in an iterative fashion until convergence to the exact
hiding solution is achieved or a pre-specified number of oscillations ` take place. In
this figure, D H;fIg refers to the transactions of D H that support itemset I.
minimize å u nm 2U u nm
( å T n 2D H;fIg Õ i m 2X u nm < msup;8I 2Bd (F 0 D )
å T n 2D H;fIg Õ i m 2R u nm msup;8I 2H
subject to
Fig. 15.2: The CSP for the second stage of the two-phase iterative algorithm.
15.2 Removal of Constraints from Infeasible CSPs
An aspect of the two-phase iterative approach that requires further investigation re-
gards the constraints selection and removal process that turns an infeasible CSP into
a feasible one. Ref. [27] considers the eviction process of Algorithm 14.1 in an at-
tempt to maximize the probability of yielding a feasible CSP after the removal of
only a few number of constraints (inequalities). This is achieved by removing the
most strict constraints, i.e. those involving the maximum number of binary variables
(equivalently, the maximum length itemsets fromD O ). To ensure that the removal of
these constraints will cause minimal loss of nonsensitive knowledge to the database,
the selected constraints are the ones that involve itemsets of low support in D O , as
those itemsets would be the first to be hidden if the support threshold was increased.
 
Search WWH ::




Custom Search