Database Reference
In-Depth Information
Table 10.4 Serial and parallel frequent graph mining algorithms
Type
Acronym
Name
Cite
Serial
AGM
Apriori-based Graph Mining
[ 25 ]
Serial
FSG
Frequent Subgraph Mining
[ 29 ]
Serial
gSpan
Graph-based Substructure Pattern Mining
[ 60 ]
Serial
HSIGRAM
Horizontal Single Graph Miner
[ 30 ]
Serial
MoFa
Molecular Fragment Miner
[ 6 ]
Serial
SUBDUE
Substructure Discovery
[ 24 ]
Serial
VSIGRAM
Vertical Single Graph Miner
[ 30 ]
Parallel
p-MoFa
Parallel MoFa
[ 37 ]
Parallel
p-gSpan
Parallel gSpan
[ 37 ]
Parallel
d-MoFa
Distributed MoFa with Dynamic Load Balancing
[ 14 ]
Parallel
MotifMiner
MotifMiner Toolkit
[ 57 ]
Parallel
MRFSE
MapReduce-based Frequent Subgraph Extraction
[ 35 ]
Parallel
MRPF
MapReduce-based Pattern Finding
[ 34 ]
Parallel
SP-SUBDUE
Static Partitioning SUBDUE
[ 11 ]
Parallel
SP-SUBDUE-2
Static Partitioning SUBDUE 2
[ 47 ]
6
Frequent Graph Mining
Successful algorithms for frequent graph mining (FGM) are often directly related to
those designed for FIM. In this section, we provide a brief overview of some key
serial methods for FGM and then discuss strategies and challenges for parallelizing
FGM. For easy reference, Table 10.4 lists the serial and parallel methods described
in this section.
6.1
Serial Frequent Graph Mining
Frequent graph mining comes with an added computational expense that is not present
in frequent itemset or sequence mining. Determining if one graph is a subgraph of
another is an NP-complete problem, known as the subgraph isomorphism problem .
The cost of pruning infrequent candidates by performing isomorphism checks is very
costly and thus most FGM methods make an effort to reduce the amount of pruning
necessary.
Inokuchi et al. [ 25 ] developed Apriori-based Graph Mining (AGM), which ex-
tended the Apriori algorithm to FGM. During candidate generation, subgraphs are
extended by joining two frequent subgraphs and expanding by one edge in all pos-
sible ways. Kuramochi and Karypis [ 29 ] described the Frequent Subgraph mining
algorithm (FSG), another candidate generation based method. Size- k candidates are
generated by joining two frequent ( k
1)-subgraphs. Subgraphs are joined only if
they share the same ( k
2)-subgraph. When candidates are joined, an edge is grown
by either connecting two existing vertices or adding a new vertex. Infrequent sub-
graphs must be pruned after all size- k candidates are generated. FSG optimizes the
pruning stage by only adding edges during candidate generation which are known to
 
Search WWH ::




Custom Search