Frequent Pattern Mining Algorithms: A Survey - Frequent Pattern Mining

Database Reference

In-Depth Information

For the toy transaction database in Table 2.1 the frequent maximal itemsets at min-

imum support 3 are abcd , e , f , as illustrated in Fig. 2.13 . All the rectangles filled

with grey color represent maximal frequent patterns. As we can see in Fig. 2.4 , that

there are no frequent supersets of abcd , e or f .

Closed Frequent Itemset The closure operator γ induces an equivalence relation

on the power set of items partitioning it into disjoint subsets called equivalence

classes. The largest element with respect to the number of items in each equivalence

class is called a closed itemset. A frequent itemset P is closed if γ ( P )

P . From

the closure property it can be said that both γ ( P ) and P have the same tidset. In

simpler terms, an itemset is closed if it does not have any frequent superset with the

same support. A closed itemset

=

C

can be written as:

C ={

P

|

P

∈ F

and

Q

⊃

P , such that suppor t ( Q )

=

suppor t ( P )

}

Because maximal itemsets have no frequent superset, they are vacuously closed

frequent itemsets. Thus, all maximal patterns are closed. However, there is a key

difference between mining maximal itemsets and closed itemsets. Mining maximal

itemsets loses information about the support of the underlying itemsets. On the

other hand, mining closed itemsets does not lose any information about the support.

The support of the missing subsets can be derived from the closed frequent pattern

database. One way of viewing closed frequent patterns is as the maximal patterns

from each equi-support group of frequent patterns. Closed frequent itemsets are a

condensed representation of frequent itemsets that is lossless.

For the toy transaction database of Table 2.1 the frequent closed patterns are

a , b , abcd , be for minimum support value of 3, as illustrated in Fig. 2.13 . All

the rectangles with dotted border represent closed frequent patterns. The remaining

nodes in the tree (not filled and dotted border) represent frequent itemsets.

5.2

Frequent Maximal Itemset Mining Algorithms

In this subsection, we will discuss some of maximal frequent itemset mining

algorithms.

5.2.1

MaxMiner Algorithm

The MaxMiner algorithm was the first algorithm that used a variety of optimizations

to improve the effectiveness of tree explorations [ 10 ]. This algorithm is generally

focussed on determining maximal patterns rather than all patterns. The author of [ 10 ]

observed that it is usually sufficient to only report maximal patterns, when frequent

patterns are long. This is because of the combinatorial explosion in examining all

subsets of patterns. Although the exploration of the tree is still done in breadth-first

fashion, a number of optimizations are used to improve the efficiency of exploration:

Frequent Pattern Mining

Search WWH ::

Custom Search

Home