Database Reference
In-Depth Information
Fig. 2.2 The lattice of
itemsets
Null
FREQUENT ITEMSETS
a
b
c
d
ab
ac
ad
bc
bd
cd
abc
abd
acd
bcd
INFREQUENT ITEMSETS
abcd
BORDER BETWEEN
FREQUENT AND
INFREQUENT ITEMSETS
first phase. This chapter will also focus on the first phase of frequent pattern mining,
which is generally considered more important and non-trivial.
Frequent patterns satisfy a downward closure property , according to which every
subset of a frequent pattern is also frequent. This is because if a pattern P is a
subset of a transaction, then every pattern P
P will also be a subset of T .
Therefore, the support of P can be no less than that of P . The space of exploration
of frequent patterns can be arranged as a lattice, in which every node is one of the 2 d
possible itemsets, and an edge represents an immediate subset relationship between
these itemsets. An example of a lattice of possible itemsets for a universe of items
corresponding to
is illustrated in Fig. 2.2 . The lattice represents the search
of frequent patterns, and all frequent pattern mining algorithms must, in one way or
another, traverse this lattice to identify the frequent nodes of this lattice. The lattice is
separated into a frequent and an infrequent part with the use of a border . An example
of a border is illustrated in Fig. 2.2 . This border must satisfy the downward closure
property.
The lattice can be traversed with a variety of strategies such as breadth-first or
depth-first methods. Furthermore, candidate nodes of the lattice may be generated
in many ways, such as using joins, or using lexicographic tree-based extensions.
Many of these methods are conceptually equivalent to one another. The following
discussion will provide an overview of the different strategies that are commonly
used.
{
a , b , c , d
}
2
Join-Based Algorithms
Join-based algorithms generate ( k
1)-candidates from frequent k -patterns with the
use of joins. These candidates are then validated against the transaction database.
The Apriori method uses joins to create candidates from frequent patterns, and is
one of the earliest algorithms for frequent pattern mining.
+
Search WWH ::




Custom Search