one minimum support threshold. An Apriori-like optimization technique can be
adopted, based on the knowledge that an ancestor is a superset of its descendants:
The search avoids examining itemsets containing any item of which the ancestors do
not have minimum support.
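The ancestor-based pruning described above can be sketched as follows. This is a minimal illustration, not a full mining algorithm: the hierarchy in `PARENT`, the item names, and the helper functions are all hypothetical, and transactions are assumed to be sets of items that get extended with their ancestors before counting.

```python
# Hypothetical concept hierarchy: child -> parent (None marks a top-level item).
PARENT = {
    "computer": None,
    "laptop computer": "computer",
    "desktop computer": "computer",
    "printer": None,
    "laser printer": "printer",
}


def ancestors(item):
    """Return all ancestors of an item, nearest first."""
    result = []
    parent = PARENT.get(item)
    while parent is not None:
        result.append(parent)
        parent = PARENT.get(parent)
    return result


def extend(transaction):
    """Add every ancestor of every purchased item to the transaction."""
    extended = set(transaction)
    for item in transaction:
        extended.update(ancestors(item))
    return extended


def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def prunable_items(transactions, min_sup):
    """Items that have at least one ancestor below the uniform min_sup.

    Under uniform support, itemsets containing such items need not be examined.
    """
    extended = [extend(t) for t in transactions]
    infrequent = {i for i in PARENT if support({i}, extended) < min_sup}
    return {i for i in PARENT if any(a in infrequent for a in ancestors(i))}
```

Because every transaction that contains a descendant also contains its ancestors after extension, a descendant can never have higher support than its ancestor, which is what makes this pruning safe under a single threshold.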
The uniform support approach, however, has some drawbacks. It is unlikely that
items at lower abstraction levels will occur as frequently as those at higher abstraction
levels. If the minimum support threshold is set too high, the search could miss some
meaningful associations occurring at low abstraction levels. If the threshold is set too low,
it may generate many uninteresting associations occurring at high abstraction levels.
This provides the motivation for the next approach.
Using reduced minimum support at lower levels (referred to as reduced support):
Each abstraction level has its own minimum support threshold. The deeper the
abstraction level, the smaller the corresponding threshold. For example, in Figure 7.4,
the minimum support thresholds for levels 1 and 2 are 5% and 3%, respectively. In
this way, “computer,” “laptop computer,” and “desktop computer” are all considered
frequent.
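A small sketch may help contrast reduced support with uniform support. The level thresholds (5% and 3%) come from the example above; the support values attached to each item are assumed purely for illustration and are not taken from the figure, and the function names are hypothetical.

```python
# Minimum support per abstraction level, as in the example: level 1 -> 5%, level 2 -> 3%.
MIN_SUP_BY_LEVEL = {1: 0.05, 2: 0.03}

# Item name -> (abstraction level, support). Support values are assumed for illustration.
ITEMS = {
    "computer": (1, 0.10),
    "laptop computer": (2, 0.06),
    "desktop computer": (2, 0.04),
}


def frequent_reduced(items, min_sup_by_level):
    """Keep items whose support meets the threshold of their own level."""
    return [
        name
        for name, (level, sup) in items.items()
        if sup >= min_sup_by_level[level]
    ]


def frequent_uniform(items, min_sup):
    """Keep items whose support meets one threshold shared by all levels."""
    return [name for name, (_, sup) in items.items() if sup >= min_sup]
```

With these assumed supports, a uniform 5% threshold would drop “desktop computer,” whereas the per-level thresholds keep all three items.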
Using item or group-based minimum support (referred to as group-based support):
Because users or experts often have insight as to which groups are more
important than others, it is sometimes more desirable to set up user-specific, item, or
group-based minimum support thresholds when mining multilevel rules. For example,
a user could set up the minimum support thresholds based on product price or on
items of interest, such as by setting particularly low support thresholds for “camera
with price over $1000” or “Tablet PC,” to pay particular attention to the association
patterns containing items in these categories.
For mining patterns with mixed items from groups with different support thresholds,
usually the lowest support threshold among all the participating groups is
taken as the support threshold in mining. This will avoid filtering out valuable
patterns containing items from the group with the lowest support threshold. In
the meantime, the minimum support threshold for each individual group should be
kept to avoid generating uninteresting itemsets from each group. Other interestingness
measures can be used after the itemset mining to extract truly interesting
rules.
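The rule for mixed itemsets can be sketched as below. The group names, group assignments, and threshold values are hypothetical and chosen only to show the mechanics: an itemset is tested against the lowest threshold among the groups its items belong to, while an itemset drawn from a single group keeps that group's own threshold.

```python
# Hypothetical assignment of items to user-defined groups.
GROUP_OF = {
    "camera over $1000": "high-value",
    "Tablet PC": "high-value",
    "computer": "general",
    "printer": "general",
}

# Hypothetical minimum support threshold per group.
GROUP_MIN_SUP = {"high-value": 0.01, "general": 0.05}


def itemset_threshold(itemset):
    """Lowest minimum support among the groups participating in the itemset."""
    return min(GROUP_MIN_SUP[GROUP_OF[item]] for item in itemset)


def is_frequent(itemset, sup):
    """Test an itemset's support against its group-derived threshold."""
    return sup >= itemset_threshold(itemset)
```

For instance, under these assumed values an itemset {“computer”, “Tablet PC”} with 2% support passes, because the “high-value” group lowers its threshold to 1%, while {“computer”, “printer”} with the same support is filtered out by the 5% threshold of its single group.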
Notice that the Apriori property may not always hold uniformly across all of the
items when mining under reduced support and group-based support. However, efficient
methods can be developed based on the extension of the property. The details are left as
an exercise for interested readers.
A serious side effect of mining multilevel association rules is its generation of many
redundant rules across multiple abstraction levels due to the “ancestor” relationships
among items. For example, consider the following rules where “laptop computer” is an
ancestor of “Dell laptop computer” based on the concept hierarchy of Figure 7.2, and
 