Database Reference
In-Depth Information
4.1
Problem Definition
Given a collection of graphs D and a similarity threshold α
[0, 1], a subset of graphs
R
D is α -orthogonal with respect to D iff for any G a , G b R
, sim ( G a , G b )
α
and for any G i
, sim ( G i , G j ) .
Given a collection of graphs D ,an α -orthogonal set
D
\ R
there exists a G j
R
R
D and a similarity
threshold β
[0, 1],
R
represents a graph G
D , provided that there exists some
G a
R
, such that sim ( G a , G )
β . Let Υ (
R
, D )
={
G
|
G
D s.t.
G a
R
, sim ( G a , G )
β
}
, then
R
is a β -representative set for Υ (
R
, D ).
Given D and
R
, the residue set of
R
is the set of unrepresented patterns in D ,
denoted as
.
The problem defined in [ 2 ] is to find the α -orthogonal, β -representative set for
the set of all maximal frequent subgraphs
(
R
, D )
=
D
\{ R
Υ (
R
, D )
}
which minimizes the residue set size.
The mining problem can be decomposed into two subproblems of maximal subgraph
mining and orthogonal representative set generation , which are discussed separately.
Algorithm 18 shows the algorithm framework of ORIGAMI .
M
4.2
Randomized Maximal Subgraph Mining
As the first step, ORIGAMI mines a set of maximal subgraphs, on which the α -
orthogonal, β -representative graph pattern set is generated. This is based on the
observation that the number of maximal frequent subgraphs is much fewer than that
of frequent subgraphs, and the maximal subgraphs provide a synopsis of the frequent
ones to some extent. Thus it is reasonable to mine the representative orthogonal pat-
tern set based on the maximal subgraphs rather than the frequent ones. However,
Search WWH ::




Custom Search