Database Reference
In-Depth Information
Chapter 3
Pattern-Growth Methods
Jiawei Han and Jian Pei
Abstract Mining frequent patterns has been a focused topic in data mining re-
search in recent years, with the development of numerous interesting algorithms
for mining association, correlation, causality, sequential patterns, partial periodic-
ity, constraint-based frequent pattern mining, associative classification, emerging
patterns, etc. Many studies adopt an Apriori-like, candidate generation-and-test ap-
proach. However, based on our analysis, candidate generation and test may still be
expensive, especially when encountering long and numerous patterns.
A new methodology, called frequent pattern growth , which mines frequent pat-
terns without candidate generation, has been developed. The method adopts a
divide-and-conquer philosophy to project and partition databases based on the cur-
rently discovered frequent patterns and grow such patterns to longer ones in the
projected databases. Moreover, efficient data structures have been developed for
effective database compression and fast in-memory traversal. Such a methodology
may eliminate or substantially reduce the number of candidate sets to be generated
and also reduce the size of the database to be iteratively examined, and, therefore,
lead to high performance.
In this paper, we provide an overview of this approach and examine its
methodology and implications for mining several kinds of frequent patterns, in-
cluding association, frequent closed itemsets, max-patterns, sequential patterns, and
constraint-based mining of frequent patterns. We show that frequent pattern growth is
efficient at mining large data-bases and its further development may lead to scalable
mining of many other kinds of patterns as well.
Keywords
Scalable data mining methods and algorithms
·
Frequent patterns
·
Associations
·
Sequential patterns
·
Constraint-based mining
Search WWH ::




Custom Search