the number of generated association rules. For example, one important early idea is to
use multilevel organization and summarization of discovered rules to eliminate
redundant rules [5]. Other early attempts include pruning association rules using the
idea of rule cover and grouping similar association rules using clustering [13]. Such
an approach processes the generated rules to create condensed rules that are easier to
understand. Other proposals along this line of research explore the use of ontologies
[7, 8] or domain knowledge [2] to prune or group discovered association rules.
Another direction of research focuses on generating non-redundant association
rules. Some researchers use the concept of frequent closed itemsets to generate non-
redundant rules [15], or use the notion of representative basis to identify minimum
and unique association rules [6]. More recent proposals on generating non-redundant
rules include the idea of using a reliable basis to represent association rules [14]. These
methods can effectively produce sets of non-redundant rules that are much smaller than the
rule sets generated by the traditional approach. However, the number of less-redundant
rules generated is still high from a business end-user's point of view.
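To make the notion of a frequent closed itemset concrete, the following is a minimal brute-force sketch. It illustrates only the general definition (an itemset is closed if no proper superset has the same support count) and is not the algorithm of [15]. The transaction data and the names `frequent_itemsets` and `closed_itemsets` are hypothetical and chosen for illustration.

```python
from itertools import combinations

# Hypothetical toy transactions, invented purely for illustration.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]

def frequent_itemsets(transactions, min_count):
    """Map each itemset occurring in at least min_count transactions to its count.

    Brute-force enumeration over all item combinations; suitable for toy data only.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            occurrences = sum(itemset <= t for t in transactions)
            if occurrences >= min_count:
                result[itemset] = occurrences
    return result

def closed_itemsets(frequent):
    """Keep only itemsets that have no proper superset with the same count.

    Checking frequent supersets is sufficient: a superset with an equal
    count is itself frequent whenever the subset is.
    """
    closed = {}
    for itemset, occurrences in frequent.items():
        absorbed = any(
            itemset < other and occurrences == other_count
            for other, other_count in frequent.items()
        )
        if not absorbed:
            closed[itemset] = occurrences
    return closed

frequent = frequent_itemsets(TRANSACTIONS, min_count=2)
closed = closed_itemsets(frequent)
print(len(frequent), "frequent itemsets,", len(closed), "of them closed")
```

On this toy data, {butter} is not closed because {bread, butter} occurs in exactly the same transactions, so rules derived from closed itemsets alone already cover it. The reduction is real but, as the text notes, it does not by itself guarantee a rule set small enough for an end user.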
Fig. 1. Process of Association Rule Mining
The second issue of association rule mining is the difficulty in setting thresholds
for interestingness metrics (e.g., support and confidence measures). Usually, improper
threshold settings can result in generating either too few or too many rules, and in
either case this impedes identification of interesting rules. To tackle this issue, some
researchers (e.g., [12]) attempt to derive minimum support based on additional
metrics such as lift or conviction measures, but this requires other user-specified input
values. More recent work attempts to avoid user input by deriving the support
threshold from the data alone [9]. However, different subsets of rules may need
different thresholds and a single threshold value may not be able to extract all
interesting rules.
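The sensitivity to thresholds can be seen in a small sketch. The code below computes support, confidence, lift, and conviction using their standard definitions and counts how many rules survive different threshold settings. The transactions and the names `count`, `rule_metrics`, and `mine_rules` are hypothetical; this is a brute-force illustration, not the method of [9] or [12].

```python
from itertools import combinations

# Hypothetical toy transactions, invented purely for illustration.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]

def count(itemset, transactions):
    """Number of transactions containing every item in itemset."""
    items = frozenset(itemset)
    return sum(items <= t for t in transactions)

def rule_metrics(antecedent, consequent, transactions):
    """Return (support, confidence, lift, conviction) for antecedent -> consequent.

    Assumes both sides occur in at least one transaction.
    """
    n = len(transactions)
    c_a = count(antecedent, transactions)
    c_c = count(consequent, transactions)
    c_ac = count(set(antecedent) | set(consequent), transactions)
    support = c_ac / n
    confidence = c_ac / c_a
    lift = confidence / (c_c / n)
    # Conviction is unbounded when the rule never fails (confidence of 1).
    conviction = float("inf") if c_ac == c_a else (1 - c_c / n) / (1 - confidence)
    return support, confidence, lift, conviction

def mine_rules(transactions, min_support, min_confidence):
    """Enumerate all rules meeting both thresholds (brute force, toy data only)."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, len(items) + 1):
        for itemset in combinations(items, size):
            c = count(itemset, transactions)
            if c == 0 or c / n < min_support:
                continue
            for k in range(1, size):
                for antecedent in combinations(itemset, k):
                    consequent = tuple(i for i in itemset if i not in antecedent)
                    metrics = rule_metrics(antecedent, consequent, transactions)
                    if metrics[1] >= min_confidence:
                        rules.append((antecedent, consequent, metrics))
    return rules

for min_sup, min_conf in [(0.6, 0.8), (0.4, 0.6), (0.2, 0.6)]:
    found = mine_rules(TRANSACTIONS, min_sup, min_conf)
    print(f"min_support={min_sup}, min_confidence={min_conf}: {len(found)} rules")
```

Even on five transactions, the strictest setting yields no rules, while lowering the support threshold to 0.4 and then 0.2 yields 4 and then 6 rules. On realistic data the same effect is far more pronounced, which is why a single hand-picked threshold is hard to get right.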
Note that most of the aforementioned work focuses on finding a small set of
representative rules to replace redundant rules. Our work in this paper serves a
different purpose: to provide a highly succinct summary of all the rules generated,
and to provide a summary of the interestingness measures, so as to guide users in
setting more appropriate interestingness thresholds. Hence, our proposed method
focuses on the post-processing of rules generated by any association rule mining
method.