Database Reference
In-Depth Information
Table 10.3 Serial and parallel frequent sequence mining algorithms
Type
Acronym
Name
Cite
Serial
AprioriAll
AprioriAll
[ 2 ]
Serial
BIDE
BI-Directional Extension
[ 56 ]
Serial
CloSpan
Closed Sequential Pattern Mining
[ 61 ]
Serial
FreeSpan
Frequent Pattern-projected Sequential Pattern Mining
[ 20 ]
Serial
GSP
General Sequential Patterns
[ 54 ]
Serial
PrefixScan
Prefix-projected Sequential Pattern Mining
[ 43 ]
Serial
SPADE
Sequential PAttern Discovery using Equivalence classes
[ 62 ]
Serial
WAP-miner
Web Access Pattern Miner
[ 42 ]
Parallel
ACME
Advanced Parallel Motif Extractor
[ 50 ]
Parallel
DGSP
Distributed GSP
[ 44 ]
Parallel
DPF
Data Parallel Formulation
[ 18 ]
Parallel
EVE
Event Distribution
[ 27 ]
Parallel
EVECAN
Event and Candidate Distribution
[ 27 ]
Parallel
HPSPM
Hash Partitioned Sequential Pattern Mining
[ 53 ]
Parallel
MG-FSM
Large-Scale Frequent Sequence Mining
[ 38 ]
Parallel
NPSPM
Non-Partitioned Sequential Pattern Mining
[ 53 ]
Parallel
Par-ASP
Parallel PrefixSpan with Sampling
[ 9 ]
Parallel
Par-CSP
Parallel CloSpan with Sampling
[ 10 ]
Parallel
PLUTE
Parallel Sequential Patterns Mining
[ 45 ]
Parallel
pSPADE
Parallel SPADE
[ 65 ]
of parallelizing these approaches. For easy reference, Table 10.3 lists the serial and
parallel methods described in this section.
5.1
Serial Frequent Sequence Mining
As direct extensions of the Apriori algorithm for the FIM problem, AprioriAll [ 2 ] and
General Sequential Patterns (GSP) [ 54 ], by Srikant and Agrawal, make use of the
downward closure property and follow the same general multi-pass candidate gener-
ation outline as described in Sect. 2 . The join operation is redefined for the sequence
domain s.t. only ( k
2)-prefix are
joined. GSP generalizes the problem definition, introducing time constraints (mini-
mum or maximum time period between elements in a frequent sequence), a sliding
window (sequence itemset items may come from a set of transactions with times-
tamps within a user-specified time window), and user-defined taxonomies (the set of
items I may be provided as a hierarchy and sequential patterns may include items
across all levels of the hierarchy).
Pattern growth methods also exhibit direct extensions to the sequence do-
main. Zaki [ 64 ] developed Sequential PAttern Discovery using Equivalence classes
(SPADE), patterned after their ECLAT [ 62 ] algorithm for FIM, which is introduced
in Sect. 2 . During the first scan of the sequence database, SPADE creates, for each
discovered sequence element (itemset), an id-list containing < sequence id,
timestamp > pairs denoting locations of the element in the database. Candidate
1)-length frequent sequences with the same ( k
 
Search WWH ::




Custom Search