Table 4-1   Example Association Rules

Rule ID  Antecedent          Consequent  Support  Confidence
1        Milk                Bread       .50      .66
2        Ham, Bacon, Bread   Eggs        .14      .45
3        Apples, Grapes      Oranges     .37      .25
4        Cereal, Bananas     Milk        .44      .78
5        Steak               Steaksauce  .19      .39
6        Cakemix, Oil        Eggs        .08      .78

(one category implies an item), or “angus filet mignon implies steak
sauce” (one item implies a category).
The mining function for association provides most of the func-
tionality for specifying inputs to model building and retrieving rules.
As such, JDM does not specify any algorithm settings for association;
however, the most popular algorithm is Apriori. For retrieving rules,
JDM focuses on filtering by various criteria; for example, users may
want to see only rules that meet a minimum support or confidence
value. Others may want rules with some minimum number of items in the
antecedent, which tend to be more interesting, or rules with a
specific set of items in the antecedent or consequent.
The rules filter may be simple or complex, depending on the needs of
the user or application. For example, consider the rules in Table 4-1.
If we are interested in only “long” rules, we may select all rules
with length 4 or greater, counting the items in both the antecedent
and the consequent. This returns only one, rule 2. If we are
interested in rules with high support and confidence, we may select
all rules with support > 0.3 and confidence > 0.5. This returns rules
1 and 4. If we are interested in any rules involving the item “milk,”
we may select those containing “milk” in the antecedent or consequent.
This returns rules 1 and 4 again. If we are interested only in what
results in the purchase of eggs, we may select those containing
“eggs” in the consequent only. This returns rules 2 and 6.
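
The four filters just described can be sketched in plain Java over the rules of Table 4-1. This is a minimal illustration only, not the JDM rules-filter API: the Rule record, the select method, and the filter constants are hypothetical names introduced here, and rule length is assumed to be the number of antecedent items plus the one consequent item. The eggs filter tests the consequent, since that is where eggs appear in rules 2 and 6. Records require Java 16 or later.

```java
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class RuleFilterSketch {

    // Hypothetical rule type for illustration; not a JDM interface.
    record Rule(int id, Set<String> antecedent, String consequent,
                double support, double confidence) {
        // Assumed definition: antecedent items plus the single consequent item.
        int length() {
            return antecedent.size() + 1;
        }
    }

    // The six rules of Table 4-1.
    static final List<Rule> TABLE_4_1 = List.of(
        new Rule(1, Set.of("Milk"), "Bread", 0.50, 0.66),
        new Rule(2, Set.of("Ham", "Bacon", "Bread"), "Eggs", 0.14, 0.45),
        new Rule(3, Set.of("Apples", "Grapes"), "Oranges", 0.37, 0.25),
        new Rule(4, Set.of("Cereal", "Bananas"), "Milk", 0.44, 0.78),
        new Rule(5, Set.of("Steak"), "Steaksauce", 0.19, 0.39),
        new Rule(6, Set.of("Cakemix", "Oil"), "Eggs", 0.08, 0.78));

    // "Long" rules: length 4 or greater.
    static final Predicate<Rule> LONG_RULES = r -> r.length() >= 4;

    // High support and confidence: support > 0.3 and confidence > 0.5.
    static final Predicate<Rule> STRONG_RULES =
        r -> r.support() > 0.3 && r.confidence() > 0.5;

    // Any rule involving milk, in the antecedent or the consequent.
    static final Predicate<Rule> INVOLVES_MILK =
        r -> r.antecedent().contains("Milk") || r.consequent().equals("Milk");

    // Rules that result in the purchase of eggs (eggs as the consequent).
    static final Predicate<Rule> LEADS_TO_EGGS =
        r -> r.consequent().equals("Eggs");

    // Returns the IDs of the rules accepted by the filter, in table order.
    static List<Integer> select(Predicate<Rule> filter) {
        return TABLE_4_1.stream()
            .filter(filter)
            .map(Rule::id)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(select(LONG_RULES));    // [2]
        System.out.println(select(STRONG_RULES));  // [1, 4]
        System.out.println(select(INVOLVES_MILK)); // [1, 4]
        System.out.println(select(LEADS_TO_EGGS)); // [2, 6]
    }
}
```

Because each filter is an ordinary Predicate, simple filters can be combined into more complex ones with and(), or(), and negate(), which mirrors the point that a rules filter may be simple or complex depending on the application.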
4.6 Clustering
Clustering has been used in customer segmentation, gene and protein
analysis, product grouping, finding taxonomies, and text mining.
Typical goals for clustering can include finding representative cases