Two–Phase Iterative Algorithm - Association Rule Hiding for Data Mining

Database Reference

In-Depth Information

Fig. 15.3: The two phases of iteration for the considered example.

The primary bottleneck that was experienced in most experiments involved the

time that was taken to run the frequent itemset mining algorithm, as well as the time

needed to solve the formulated CSPs through the application of BIP. In all tested

settings, the thresholds of minimum support were properly selected to ensure an

adequate amount of frequent itemsets and the sensitive itemsets to be hidden were

selected randomly among the frequent ones. All the experiments were conducted on

a PC running Linux on an Intel Pentium D, 3.2 Ghz processor equipped with 4 GB

of main memory. All integer programs were solved using ILOG CPLEX 9.0 [36].

A limiting factor of ` = 5 was used in all presented experiments in order to con-

trol the number of iterations that the algorithm is allowed to execute. After the exe-

cution of the first iteration of the two-phase iterative algorithm, the attained solution

as well as the respective impact on the dataset are being recorded. Then, the two-

phase iterative algorithm proceeds to up to ` subsequent iterations. If the algorithm

fails to identify an exact solution after the ` runs, then the stored solution from the

first iteration of the algorithm is selected to produce the sanitized database D. This

way, both the runtime of execution of the two-phase iterative algorithm is limited

to a level that it remains tractable (through the use of `) and also the algorithm is

adapted so that it constantly outperforms the inline algorithm of [23] and provides

superior hiding solutions. In what follows, let notation ab denote the hiding of a

itemsets of length b, i.e. 21 refers to the hiding of two items.

Search WWH ::

Custom Search

Home