Databases Reference
In-Depth Information
The first step in the discovery of association rules is to find each set of
items (called itemset ) that has co-occurrence rate above the minimum support.
An itemset with at least the minimum support is called a large itemset or a
frequent itemset. In this chapter, as in others, the term frequent itemset will
be used. The size of an itemset represents the number of items contained in the
itemset, and an itemset containing k items is called a k -itemset. For example,
beer, diaper can be a frequent 2-itemset. If an itemset is frequent and no proper
superset is frequent, we say that it is a maximally frequent itemset.
Finding all frequent itemsets has received a considerable amount of re-
search effort in all these years because it is a very resource-consuming task.
For example, if there is a frequent itemset with size l , then all 2 l
−
1 non-empty
subsets of the itemset have to be generated.
The set of all subsets of I (the powerset of I ) naturally forms a lattice,
called the itemset lattice [10, 22]. For example, consider the lattice of subsets
of I =
, shown in Fig. 1 (the empty set has been omitted). Each
maximal frequent itemset of the figure is in bold face and in an ellipse.
Due to the downward closure property of itemset support - meaning that
any subset of a frequent itemset is frequent - there is a border, so that all
frequent itemsets lie below the border, while all infrequent itemsets lie above
it. The border of frequent itemsets is shown with a bold line in Fig. 1.
An optimal association mining algorithm must only evaluate the frequent
itemsets traversing the lattice in some way. This one can be done considering
an equivalence class approach. The equivalence class of an itemset a, expressed
as E ( a ), is given as:
{
i 1 ,i 2 ,i 3 ,i 4 }
E ( a )=
{
b :
|
a
|
= k,
|
b
|
= k,Prefix k− 1 ( b )= Prefix k− 1 ( a )
}
,
(1)
where Prefix k ( c ) is the prefix of size k of c , i.e., its k first items in a lexico-
graphical order.
Fig. 1. Lattice of subsets of I = {i 1 ,i 2 ,i 3 ,i 4 }
Search WWH ::




Custom Search