no rule exists in Rule Summary 1 for a support beyond 34.21%. Indeed, we found no
rules generated for Rule Summary 1 when x = 35% in our experiments. Also,
Property 4 suggests that no additional rule will be generated in Rule Summary 1 if we
set an antecedent support threshold in the range of [20%, 24.36%]. In our
experiments, we tried setting x = 21%, x = 22%, x = 23%, and x = 24% and found that the
same set of rules was generated, which confirms the validity of Property 4.
With the antecedent support threshold fixed at 20%, Property 3 suggests that there
is no point in setting a confidence threshold higher than 68.77%. This is because no
rule exists in Rule Summary 1 for a rule confidence beyond 68.77%. Also, Property 4
suggests that no additional rule will be generated in Rule Summary 1 if we set a
confidence threshold in the range of [60%, 62.76%]. These properties were also
validated in our experiments.
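The threshold checks described above can be replayed on the five rules listed in Table 1. The following is a minimal sketch, not the authors' implementation: it filters the already-generated rules rather than re-mining the Online Purchase dataset, and it assumes that the Support (%) column is the antecedent support to which the threshold x applies. The names Rule, RULES and rules_for are hypothetical.

```python
from typing import List, NamedTuple


class Rule(NamedTuple):
    rule_id: int
    consequent: str
    antecedent: str
    support: float      # assumed to be the antecedent support, in percent
    confidence: float   # rule confidence, in percent


# The five rules of Table 1 (generated with the 20A60C setting).
RULES = [
    Rule(1, "I", "E", 34.21, 66.81),
    Rule(2, "I", "L", 24.36, 62.76),
    Rule(3, "I", "M", 27.21, 68.77),
    Rule(4, "A", "G", 30.79, 66.59),
    Rule(5, "A", "J", 29.64, 61.93),
]


def rules_for(consequent: str, min_support: float = 20.0,
              min_confidence: float = 60.0) -> List[int]:
    """Return the IDs of rules with the given consequent that meet both thresholds."""
    return [
        r.rule_id
        for r in RULES
        if r.consequent == consequent
        and r.support >= min_support
        and r.confidence >= min_confidence
    ]


# Property 3 (support): nothing in Rule Summary 1 survives a threshold above 34.21%.
print(rules_for("I", min_support=35.0))

# Property 4 (support): thresholds between 20% and 24.36% leave the rule set unchanged.
for x in (21.0, 22.0, 23.0, 24.0):
    print(x, rules_for("I", min_support=x))

# The same checks for confidence, with the support threshold fixed at 20%.
print(rules_for("I", min_confidence=69.0))
print(rules_for("I", min_confidence=62.0))
```

Filtering an existing rule set can only show the effect of raising a threshold; observing what happens below 20% or 60% would require mining the dataset again.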
Table 1. Association rules generated from the Online Purchase dataset using 20A60C
Rule ID   Consequent   Antecedent   Support (%)   Confidence (%)
1         I            E            34.21         66.81
2         I            L            24.36         62.76
3         I            M            27.21         68.77
4         A            G            30.79         66.59
5         A            J            29.64         61.93
Table 2. Summaries derived from rules in Table 1
Rule Summary   Consequent   Antecedent         Support Range (%)   Confidence Range (%)
1              I*3          E*1, L*1, M*1      24.36 - 34.21       62.76 - 68.77
2              A*2          G*1, J*1           29.64 - 30.79       61.93 - 66.59
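The step from Table 1 to Table 2 can be sketched as a group-by on the consequent. This is an illustrative reconstruction, not the authors' algorithm. It assumes that "I*3" means that consequent I occurs in three rules, that "E*1" means that antecedent item E occurs in one of them, and that each antecedent is a single item, as in Table 1. The function name summarize_by_consequent and the dictionary keys are hypothetical.

```python
from collections import Counter, defaultdict

# Rules of Table 1 as (consequent, antecedent, support %, confidence %).
RULES = [
    ("I", "E", 34.21, 66.81),
    ("I", "L", 24.36, 62.76),
    ("I", "M", 27.21, 68.77),
    ("A", "G", 30.79, 66.59),
    ("A", "J", 29.64, 61.93),
]


def summarize_by_consequent(rules):
    """Group rules by consequent and report counts and value ranges per group."""
    groups = defaultdict(list)
    for rule in rules:
        groups[rule[0]].append(rule)

    summaries = []
    for consequent, members in groups.items():
        supports = [m[2] for m in members]
        confidences = [m[3] for m in members]
        summaries.append({
            "consequent": consequent,
            "rule_count": len(members),                           # the "*3" in I*3
            "antecedents": dict(Counter(m[1] for m in members)),  # E*1, L*1, M*1
            "support_range": (min(supports), max(supports)),
            "confidence_range": (min(confidences), max(confidences)),
        })

    # List the summary with the highest consequent frequency first.
    summaries.sort(key=lambda s: -s["rule_count"])
    return summaries


for number, summary in enumerate(summarize_by_consequent(RULES), start=1):
    print(number, summary)
```

Sorting by rule count places the summary for consequent I ahead of the one for consequent A, which matches the numbering used in Table 2.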
4.3 Properties of Rule Summaries
We now study the properties of Consequent-based Association Rule Summaries.
Table 3 shows a set of rule summaries derived from rules generated from the Online
Purchase dataset. The rules were generated using four different sets of support and
confidence threshold settings.
Table 3 shows that, by fixing the confidence threshold at 60% and gradually
decreasing the antecedent support threshold from 20% to 5%, the number of rules
increases drastically. However, the number of rule summaries only increases
marginally.
Although different threshold settings have been used, the most important rule
summary tends to be the one with consequent 'I', which has the highest consequent
frequency. This is followed by the summary with consequent 'A', and then the