Geoscience Reference
In-Depth Information
5.6 THE ASSUMPTIONOF GRAPH-BASEDMETHODS
∗
Remark 5.2.
Graph-Based Semi-Supervised Learning Assumption
The labels are “smooth”
with respect to the graph, such that they vary slowly on the graph. That is, if two instances are
connected by a strong edge, their labels tend to be the same.
The notion of smoothness can be made precise by spectral graph theory. A vector
φ
is an
eigenvector of a square matrix
A
,if
Aφ
=
λφ,
(5.21)
where
λ
is the associated eigenvalue. If
φ
is an eigenvector,
cφ
is an eigenvector, too, for any
c
=
0.
But we will focus on eigenvectors of unit length
φ
=
1. Spectral graph theory is concerned with
the eigenvectors and eigenvalues of a graph, represented by its Laplacian matrix
L
or
L
. We will
analyze the unnormalized Laplacian
L
, which has the following properties:
l
+
u
L
has
l
+
u
eigenvalues (some may be the same) and corresponding eigenvectors
{
(λ
i
,φ
i
)
}
i
=
1
.
These pairs are called the graph spectrum. The eigenvectors are orthogonal:
φ
i
φ
j
=
0 for
i
=
j
.
The Laplacian matrix can be decomposed into a weighted sum of outer products:
l
+
u
λ
i
φ
i
φ
i
.
L
=
(5.22)
i
=
1
The eigenvalues are non-negative real numbers, and can be sorted as
0
=
λ
1
≤
λ
2
≤
...
≤
λ
l
+
u
.
(5.23)
In particular, the graph has
k
connected components if and only if
λ
1
=
...
=
λ
k
=
0. The
corresponding eigenvectors are constant on individual connected components, and zero else-
where, as the following example shows.
Example 5.3.
The Spectrum of a Disconnected Chain Graph
Figure 5.4(a) shows a discon-
nected, unweighted chain graph with 20 vertices. Its spectrum is shown in Figure 5.4(b). The stem
plots are the corresponding eigenvectors
φ
i
. Note
λ
1
=
λ
2
=
0 since there are two connected com-
ponents. The eigenvectors
φ
1
,φ
2
are piecewise constant, whose height is determined by length
normalization. As the eigenvalues increase in magnitude, the corresponding eigenvectors become
more and more rugged.
l
+
u
. This
Because the eigenvectors are orthogonal and have unit length, they form a basis in
R
means any
f
on the graph can be decomposed into
l
+
u
f
=
a
i
φ
i
,
(5.24)
i
=
1