Database Reference
In-Depth Information
and Splat, to discover exact dense frequent subgraphs in a set of relational graphs.
For large-scale graph database mining, a disk-based frequent graph mining method
was introduced by Wang et al. [ 40 ]. Jin et al. [ 25 ] proposed an algorithm, TSMiner ,
for mining frequent large-scale structures (defined as topological structures) from
graph datasets.
For a comprehensive introduction on basic graph pattern mining algorithms in-
cluding Apriori-based and pattern-growth approaches, readers are referred to the
survey written by Washio and Motoda [ 43 ] and Yan and Han [ 47 ].
2.4
Closed and Maximal Subgraphs
A major challenge in mining frequent subgraphs is that the mining process often
generates a huge number of patterns. This is because if a subgraph is frequent,
all of its subgraphs are frequent as well. A frequent graph pattern with n edges can
potentially have 2 n frequent subgraphs, which is an exponential number. To overcome
this problem, closed subgraph mining and maximal subgraph mining algorithms
were proposed.
Definition 13.4 (Closed Subgraph) A subgraph g is a closed subgraph in a graph
set D if g is frequent in D and there exists no proper supergraph g such that g
g
and g has the same support as g in D.
Definition 13.5 (Maximal Subgraph) A subgraph g is a maximal subgraph in a
graph set D if g is frequent, and there exists no supergraph g such that g g and
g is frequent in D.
The set of closed frequent subgraphs contains the complete information of frequent
patterns; whereas the set of maximal subgraphs, though more compact, usually does
not contain the complete support information regarding to its corresponding frequent
sub-patterns. Close subgraph mining methods include CloseGraph [ 46 ]. Maximal
subgraph mining methods include SPIN [ 23 ] and MARGIN [ 35 ].
2.5
Mining Subgraphs in a Single Graph
While most frequent subgraph mining algorithms assume the input graph data is a
set of graphs D
, there are some studies [ 8 , 16 , 30 ] on mining graph
patterns from a single large graph. Defining the support of a subgraph in a set of graphs
is straightforward, which is the number of graphs in the database that contain the
subgraph. However, it is much more difficult to find an appropriate support definition
in a single large graph since multiple embeddings of a subgraph may have overlaps.
If arbitrary overlaps between non-identical embeddings are allowed, the resulting
support does not satisfy the anti-monotonicity property, which is essential for most
frequent pattern mining algorithms. Therefore, [ 8 , 16 , 30 ] investigated appropriate
support measures in a single graph.
={
G 1 , ... , G n }
Search WWH ::




Custom Search