Roadmap to Future Work - Association Rule Hiding for Data Mining


Chapter 21

Roadmap to Future Work

There is a plethora of open issues related to the problem of association rule hiding that are still under investigation. First of all, the emergence of sophisticated exact hiding approaches of high complexity, especially for very large databases, calls for efficient parallel approaches to improve the runtime of these algorithms. Parallel approaches allow the constraint satisfaction problem (CSP) to be decomposed into numerous components that can be solved independently. The overall solution is then attained as a function of the objectives of the individual solutions. A framework for decomposition and parallelization of exact hiding approaches has recently been proposed in [25] and is covered in Chapter 17 of the book. Although this framework improves the runtime of solving the CSP produced by the exact hiding algorithms, we are confident that further optimizations can be achieved by exploiting the inherent characteristics of the constraints involved in the CSP. Also, different optimization techniques can be adopted to search the space of possible solutions for the CSP in a more advanced way.
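The decomposition idea described above can be illustrated with a minimal sketch. It is not the framework of [25]; it only assumes that each constraint can be represented by the set of variables it involves (a hypothetical representation), and groups constraints that share variables, directly or transitively, into components that could then be handed to separate solvers. The function name `decompose` and the variable names are illustrative assumptions.

```python
from collections import defaultdict


def decompose(constraints):
    """Group constraints into independent components.

    constraints: list of non-empty sets of variable names (an assumed,
    simplified representation of a CSP). Returns a sorted list of
    components, each a list of constraint indices. Constraints in
    different components share no variables, so they can be solved
    independently of each other.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    # Variables appearing in the same constraint belong together.
    for variables in constraints:
        ordered = list(variables)
        for v in ordered[1:]:
            union(ordered[0], v)

    # Collect constraint indices by the root of any of their variables.
    components = defaultdict(list)
    for index, variables in enumerate(constraints):
        components[find(next(iter(variables)))].append(index)
    return sorted(components.values())


# Hypothetical constraints over binary variables u1..u5.
example = [{"u1", "u2"}, {"u2", "u3"}, {"u4"}, {"u5", "u4"}]
print(decompose(example))  # [[0, 1], [2, 3]]
```

Each returned component could be submitted to its own solver instance; how the partial solutions are combined depends on the objective function and is not covered by this sketch.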

Regarding the use of unknowns in blocking association rule hiding algorithms, much more research is needed to provide sophisticated hiding solutions that take advantage of the capabilities offered by their use. Evidence has shown that in several real-life scenarios the use of unknowns is much preferable to the use of conventional distortion techniques. This is because distortion techniques fail to distinguish between the real values in the dataset and the ones that were distorted by the hiding algorithm in order to sanitize it. Therefore, it is our belief that future research in association rule hiding should aim at providing sophisticated and efficient solutions that make use of unknowns.
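To make the distinction between unknowns and distorted values concrete, the following minimal sketch computes the range in which the support of an itemset may lie when some entries are marked as unknown. The data layout (a list of dictionaries with values 1, 0, or "?") and the function name `support_interval` are assumptions made for illustration, not part of any specific blocking algorithm.

```python
def support_interval(database, itemset):
    """Return (min_support, max_support) of an itemset as fractions.

    database: non-empty list of transactions, each a dict mapping an
    item to 1 (present), 0 (absent) or "?" (unknown). Items missing
    from a transaction are treated as absent. This layout is an
    assumption made for the sketch.
    """
    certain = 0   # transactions in which every item is known to be present
    possible = 0  # transactions in which no item is known to be absent
    for transaction in database:
        values = [transaction.get(item, 0) for item in itemset]
        if all(v == 1 for v in values):
            certain += 1
        if all(v != 0 for v in values):
            possible += 1
    n = len(database)
    return certain / n, possible / n


# Hypothetical sanitized database with unknowns.
db = [
    {"a": 1, "b": 1},
    {"a": 1, "b": "?"},
    {"a": 0, "b": 1},
    {"a": "?", "b": "?"},
]
print(support_interval(db, {"a", "b"}))  # (0.25, 0.75)
```

A reader of such a database can see which entries were altered, since they appear as "?", whereas a distorted database would report a single support value with no indication of which entries are real.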

A different research direction concerns the use of database reconstruction approaches to generate a database from scratch that is compatible with only the non-sensitive frequent itemsets or a given set of association rules. Prominent research efforts in this direction include the work of several researchers in the field of inverse frequent set mining [17, 33, 48, 78]. However, it was recently proved that this is an NP-hard problem [12-14]. Ongoing work considers yet another solution, which is to append to the original database a synthetically generated database part so
