A Pragmatic Approach to Summarize Association Rules in Business Analytics Projects - Technologies and Applications of Artificial Intelligence

Another interesting aspect is that, when the antecedent support threshold is

reduced, the maximum support value of each rule summary remains unchanged. For

example, the maximum support of the rule summary for consequent 'I' remains at 34.21%, even though

different antecedent support thresholds were used. This validates Property 3.

When examining a rule summary, the range of its interestingness metric indicates

the appropriateness of the minimum interestingness metric threshold. If a rule

summary has a narrow interestingness metric range, it suggests that the threshold may

be too high and should be adjusted down. If the interestingness metric range of a rule

summary is too wide, it suggests that the threshold may be too low and should be

adjusted up. For example, when the minimum antecedent support threshold is set at

5% in Table 3, the support ranges become quite wide and one might consider

increasing the support threshold.
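
The range check described above can be sketched in Python. This is only an illustration under stated assumptions: the function name `suggest_threshold_adjustment` and the cutoffs `narrow` and `wide` (in percentage points) are hypothetical, since the text gives no numeric definition of a "narrow" or "too wide" range.

```python
def suggest_threshold_adjustment(metric_min, metric_max, narrow=5.0, wide=20.0):
    """Suggest how to adjust a minimum interestingness threshold.

    metric_min, metric_max: observed range of one metric (in %) in a rule summary.
    narrow, wide: hypothetical cutoffs in percentage points (assumptions,
    not values from the source).
    """
    width = metric_max - metric_min
    if width < 0:
        raise ValueError("metric_max must be >= metric_min")
    if width <= narrow:
        return "decrease"  # narrow range: the threshold may be too high
    if width >= wide:
        return "increase"  # wide range: the threshold may be too low
    return "keep"


# Illustrative values only (not taken from Table 3).
print(suggest_threshold_adjustment(30.0, 34.0))  # narrow range
print(suggest_threshold_adjustment(5.0, 34.0))   # wide range
```

In practice the cutoffs would have to be chosen per metric and per dataset; the sketch only encodes the direction of the adjustment.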

Notice that Table 3 also illustrates the effect of adjusting the interestingness

thresholds. Increasing the interestingness threshold reduces

consequent frequency but narrows the range of interestingness metrics. On the other

hand, decreasing the interestingness threshold will increase consequent frequency, but

widen the range of interestingness metrics.
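
This trade-off can be illustrated with a small Python sketch that groups rules by consequent and reports, for each consequent, the number of rules and the range of antecedent support. The `Rule` tuple, the `summarize` helper, and the rule list are hypothetical and only approximate the idea of a rule summary; they are not the authors' CARS implementation, and the numbers are invented for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Rule(NamedTuple):
    antecedent: str
    consequent: str
    support: float     # antecedent support, in %
    confidence: float  # rule confidence, in %


def summarize(rules, min_support):
    """Group the rules that pass min_support by consequent.

    Returns {consequent: (rule_count, lowest_support, highest_support)}.
    """
    groups = defaultdict(list)
    for r in rules:
        if r.support >= min_support:
            groups[r.consequent].append(r.support)
    return {c: (len(s), min(s), max(s)) for c, s in groups.items()}


# Invented rules for illustration; not from the datasets used in the paper.
RULES = [
    Rule("A", "I", 34.0, 70.0),
    Rule("B", "I", 22.0, 65.0),
    Rule("C", "I", 12.0, 62.0),
    Rule("D", "I", 6.0, 61.0),
]

for threshold in (20.0, 10.0, 5.0):
    print(threshold, summarize(RULES, threshold))
```

In this toy data, lowering the threshold from 20% to 5% raises the rule count for consequent 'I' from 2 to 4 and widens the support range, while the highest support stays at 34.0.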

For the sake of completeness, we have also obtained summaries of rules generated

from the Online Purchase dataset, with the antecedent support threshold fixed at 20%

and confidence threshold reduced gradually from 60% to 55%, then 50% and 45%.

We notice that the rule summaries tend to exhibit the same properties as those

observed in Table 3. Due to the page limit for this paper, we do not show the

rule summaries here, but shall provide the details in a future publication.

4.4 A Case of Using CARS in a Business Analytics Project

Here, we illustrate how the proposed method can be used in conjunction with a

heuristic procedure to help find interesting association rules in the Sales Transaction

dataset. The procedure is as follows:

Step 1 (Rule Generation). Given a dataset, define each attribute as both an input and

output (i.e., an attribute can be antecedent in Rule i, and it can be consequent in Rule

j, such that i ≠ j). Use reasonably low interestingness thresholds to generate as many

rules as possible for summarization.

Step 2 (Rule Summary Evaluation). Summarize the rules using CARS. Sort the rule

summaries based on the maximum confidence. Select rule summaries with

consequents that are of pragmatic importance in the domain of application. Evaluate

the impact of rule summaries using rule confidence and antecedent support ranges.

Validate each interesting rule summary by examining the credibility of frequent

antecedents in light of knowledge about the business domain.

Step 3 (Result Refinement). Let the selected important consequents remain as both

input and output, and set the rest of the attributes as inputs. If required, adjust (and
