Database Reference
In-Depth Information
Fig. 4.9
Pattern metric space
Colossal Pattern
Small Pattern
Fig. 4.10
Pattern Space
Explored by
MARGIN
One interesting feature of
Pattern-Fusion
is that it gives a good approximation
to the complete answer set by favoring colossal patterns over smaller-sized ones and
catching the outliers.
7
Pattern Traversal Approach
An alternative to finding long patterns one at a time is to identify one long pattern,
and try to find all other long patterns by exploring the adjacent ones in the neighbor-
hood. We call this type of long pattern discovery the
Pattern Navigation Approach
.
MARGIN
as proposed in [
23
] is a representative of such an approach (Fig.
4.10
).
MARGIN
is a frequent subgraph mining algorithm to find all maximal frequent
subgraphs. As long patterns form a subset of all the maximal frequent subgraphs,
MARGIN would return all the largest patterns in a graph data set. In a nutshell,
MARGIN
is based on the idea that all maximal subgraphs are adjacent in the pattern
lattice in the sense that any one of them is reachable from another one by a common
child node. The set of all candidate subgraphs which are likely to be maximally
frequent are the set of
n
-edge frequent subgraphs that have a
n
1-edge infrequent
supergraph. Such a set of nodes in the lattice is referred to as the set of
f-cut-nodes
.
Comparing to
Apriori
based algorithms,
MARGIN
greatly improves mining effi-
ciency by exloring a much smaller search space by visiting the pattern lattice around
the
f-cut-nodes
.A
cut
is defined as follows.
+