A Pragmatic Approach to Summarize Association
Rules in Business Analytics Projects
Swee Chuan Tan and Boon Hong Sim
SIM University, School of Business
461 Clementi Road, Singapore
jamestansc@unisim.edu.sg, jeffrey.swf@hotmail.com
Abstract. Association rule mining is an important data mining method
primarily used for market basket analysis. However, the method usually
generates a large number of association rules, and it is difficult to use domain-
independent objective measures to help find pragmatically important rules. To
address these issues, we present a general method that succinctly summarizes
rules with common consequent(s). This consequent-based approach allows users
to focus on evaluating a rule set based on the practical significance of
consequent(s) in an application domain, which usually outweighs the
importance of objective measures such as rule confidence. We provide a case
study to demonstrate how the proposed method can be used in conjunction with
a heuristic procedure to find important rules generated from large real-world
data, leading to the discovery of important business knowledge and insights.
1 Introduction
Association rule mining is a data mining method that discovers interesting and useful
relationships hidden in data [1]. One common application of association rule mining
is market basket analysis [11], where products that are usually purchased together in a
supermarket can be identified. The relationships among products can then be studied
and insights drawn for improving store layout and pricing strategies, or for
designing promotional strategies such as cross-selling or product bundling.
Despite the usefulness of association rule mining, a common problem faced by
end-users is that the method tends to generate too many association rules. When a
large number of rules are generated, many of the rules are redundant because they
convey the same amount of information, or are insignificant because they contain
common knowledge about the business. In most cases, the redundant or
trivial rules far outnumber the essential rules, which makes the
discovery of truly interesting rules a challenge [3].
Another problem with association rule mining is that users must specify
minimum interestingness thresholds for discovering interesting association rules.
Usually, a threshold is set arbitrarily. Using a threshold that is too low may generate
too many association rules, which are difficult to interpret [4]. On the other hand,
using a threshold that is too high may remove rare rules that could be important for
discovering new information [10].
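
The threshold trade-off described above can be illustrated with a minimal sketch. It assumes the usual support and confidence measures as the interestingness thresholds, uses a hypothetical toy set of transactions that does not come from the paper, and enumerates only one-item-to-one-item rules by brute force. The function names support and mine_rules are illustrative, not part of any library or of the proposed method.

```python
from itertools import permutations

# Hypothetical toy transactions (an assumption for illustration only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def mine_rules(transactions, min_support, min_confidence):
    """Brute-force enumeration of rules A -> B with single-item A and B."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in permutations(items, 2):
        s = support({a, b}, transactions)
        if s < min_support:
            continue  # rule is too rare for this support threshold
        conf = s / support({a}, transactions)  # confidence of A -> B
        if conf >= min_confidence:
            rules.append((a, b, s, conf))
    return rules


# Vary the minimum support while holding the minimum confidence fixed.
for min_support in (0.1, 0.3, 0.5):
    rules = mine_rules(transactions, min_support, min_confidence=0.4)
    print(f"min_support={min_support}: {len(rules)} rules")
    for a, b, s, conf in rules:
        print(f"  {a} -> {b} (support={s:.2f}, confidence={conf:.2f})")
```

On this toy data, raising the minimum support from 0.1 to 0.5 shrinks the rule set, but it also discards the rule butter -> bread, which holds in every transaction that contains butter. This is the kind of rare but potentially informative rule that a high threshold can remove.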