Chapter 11
Sequential Pattern Mining
Wei Shen, Jianyong Wang and Jiawei Han
Abstract Sequential pattern mining, which discovers frequent subsequences as
patterns in a sequence database, has been a focus of data mining research for
over a decade. The problem has broad applications, such as mining customer
purchase patterns and Web access patterns. It is also challenging, because the
mining may have to generate or examine a combinatorially explosive number of
intermediate subsequences. An abundant literature has been dedicated to this
problem, and considerable progress has been made. This chapter presents a
thorough overview and analysis of the main approaches to sequential pattern mining.
Keywords Sequential · pattern · mining
1 Introduction
Sequential pattern mining discovers subsequences that appear in a sequence database
with frequency no less than a user-specified threshold. A sequence database stores
a number of records, where all records are ordered sequences of events, with or
without concrete notions of time. Examples of sequences include retail customer
transactions, DNA sequences, and web log data. A subsequence, such as buying first
a PC, then a digital camera, and then a memory card, if it occurs frequently in a
customer transaction database, is a (frequent) sequential pattern.
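The definition above can be made concrete with a minimal sketch. It assumes a simplified setting in which every element of a sequence is a single event rather than a set of items, and it takes the frequency of a pattern to be the number of database sequences that contain it as a subsequence. The function names, the toy database, and the threshold are hypothetical and chosen only for illustration.

```python
from typing import Hashable, Sequence


def is_subsequence(pattern: Sequence[Hashable], sequence: Sequence[Hashable]) -> bool:
    """Return True if the events of `pattern` occur in `sequence` in the same
    order, not necessarily contiguously."""
    remaining = iter(sequence)
    # `event in remaining` advances the iterator until it finds `event`,
    # so each later event is searched for only after the previous match.
    return all(event in remaining for event in pattern)


def support(pattern: Sequence[Hashable], database: Sequence[Sequence[Hashable]]) -> int:
    """Count the sequences in `database` that contain `pattern`."""
    return sum(1 for sequence in database if is_subsequence(pattern, sequence))


def is_frequent(pattern: Sequence[Hashable],
                database: Sequence[Sequence[Hashable]],
                min_support: int) -> bool:
    """A pattern is frequent if its support is no less than `min_support`."""
    return support(pattern, database) >= min_support


# Hypothetical toy database: one purchase sequence per customer.
database = [
    ["PC", "printer", "digital camera", "memory card"],
    ["PC", "digital camera", "memory card"],
    ["digital camera", "PC", "memory card"],
    ["PC", "memory card"],
]
pattern = ["PC", "digital camera", "memory card"]

print(support(pattern, database))                     # 2
print(is_frequent(pattern, database, min_support=2))  # True
```

Checking one given pattern is cheap. The difficulty lies in deciding which candidate patterns to check, since the number of possible subsequences grows combinatorially with sequence length.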
Sequential pattern mining is an important data mining problem with broad
applications, such as mining customer purchase patterns, identifying outer membrane
proteins, automatically detecting erroneous sentences, discovering block correlations
in storage systems, identifying copy-paste and related bugs in large-scale software