Database Reference
In-Depth Information
Association rules are implications that hold in a transactional database under
certain user-specified parameters that account for their significance. Significant as-
sociation rules provide knowledge to the data miner as they effectively summarize
the data, while uncovering any hidden relations (among items) that hold in the data.
The term “association rule hiding” has been mentioned for the first time in 1999
by Atallah et al. [10] in a workshop paper on knowledge and data engineering. The
authors of this work studied the problem of modifying the original database in a
way that certain (termed as “sensitive”) association rules disappear without, how-
ever, seriously affecting the original data and the nonsensitive rules. They proposed
a number of solutions like fuzification of the original database, limiting access to the
original data, as well as releasing samples instead of the entire database. Due to the
combinatorial nature of the problem, all of the solutions that the authors proposed,
as well as the vast majority of solutions that have been proposed since then, were
based on heuristics. Heuristic solutions, although computationally efficient and scal-
able, suffer from local optima issues as they require optimizing a specific function
in each step of the algorithm, which however does not guarantee finding an opti-
mal hiding solution to the whole problem. More recently, a new direction of exact
approaches to association rule hiding has been proposed. These approaches have
increased time and memory requirements but in return they offer quality guarantees
on the identified solution.
A Motivating Example
The following example scenario, borrowed from the work of Verykios et al. [72],
complements the real world examples we presented earlier and motivates the neces-
sity of applying association rule hiding algorithms to protect sensitive association
rules against disclosure. Let us suppose that we are negotiating with Dedtrees Pa-
per Company, as purchasing directors of BigMart, a large supermarket chain. They
offer their products in reduced prices, provided that we agree to give them access
to our database of customer purchases. We accept the deal and Dedtrees starts min-
ing our data. By using an association rule mining tool, they find that people who
purchase skim milk also purchase Green Paper. Dedtrees now runs a coupon mar-
keting campaign offering a 50 cents discount on skim milk with every purchase of
a Dedtrees product. The campaign cuts heavily into the sales of Green Paper, which
increases the prices to us, based on the lower sales. During our next negotiation
with Dedtrees, we find out that with reduced competition they are unwilling to offer
to us a low price. Finally, we start losing business to our competitors, who were
able to negotiate a better deal with Green Paper. In other words, the aforementioned
scenario indicates that BigMart should sanitize competitive information (and other
important corporate secrets of course) before delivering their database to Dedtrees,
so that Dedtrees does not monopolize the paper market. Similar motivating exam-
ples for association rule hiding are discussed in the work of [18, 54, 66].
Search WWH ::




Custom Search