Figure 11.10 The framework of spectral clustering approaches: the data are mapped to an affinity matrix $[w_{ij}]$; a matrix $A = f(w)$ is derived from it; the $k$ leading eigenvectors of $A$ (satisfying $Av = \lambda v$) define a new space in which clustering is performed; the result is projected back to cluster the original data. Source: Adapted from Slide 8 at http://videolectures.net/micued08_azran_mcl/.
2. Using the affinity matrix $W$, derive a matrix $A = f(W)$. The way in which this is done can vary. The Ng-Jordan-Weiss algorithm defines a diagonal matrix $D$ such that $D_{ii}$ is the sum of the $i$th row of $W$, that is,

$$D_{ii} = \sum_{j=1}^{n} W_{ij}. \qquad (11.24)$$

$A$ is then set to

$$A = D^{-\frac{1}{2}} W D^{-\frac{1}{2}}. \qquad (11.25)$$
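Step 2 can be sketched in a few lines of NumPy. This is a minimal illustration, not the book's code; the helper name `njw_affinity_to_A` is hypothetical, and it assumes every row of $W$ has a positive sum so that $D^{-1/2}$ exists.

```python
import numpy as np

def njw_affinity_to_A(W):
    """Form A = D^{-1/2} W D^{-1/2} from an affinity matrix W (eqs. 11.24-11.25).

    Assumes W is symmetric with positive row sums.
    """
    d = W.sum(axis=1)              # D_ii: sum of the ith row of W (eq. 11.24)
    d_inv_sqrt = 1.0 / np.sqrt(d)  # diagonal entries of D^{-1/2}
    # Left-multiplying by D^{-1/2} scales rows; right-multiplying scales columns.
    return W * np.outer(d_inv_sqrt, d_inv_sqrt)  # eq. 11.25
```

Because $D$ is diagonal, the two matrix products reduce to elementwise row and column scaling, which avoids building $D^{-1/2}$ explicitly.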
3. Find the $k$ leading eigenvectors of $A$. Recall that the eigenvectors of a square matrix are the nonzero vectors that remain proportional to the original vector after being multiplied by the matrix. Mathematically, a vector $v$ is an eigenvector of a matrix $A$ if $Av = \lambda v$, where $\lambda$ is called the corresponding eigenvalue. This step derives $k$ new dimensions from $A$, which are based on the affinity matrix $W$. Typically, $k$ should be much smaller than the dimensionality of the original data.

The Ng-Jordan-Weiss algorithm computes the $k$ eigenvectors with the largest eigenvalues, $x_1, \ldots, x_k$, of $A$.
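With NumPy, the $k$ leading eigenvectors of the symmetric matrix $A$ can be extracted as follows (a minimal sketch; `leading_eigenvectors` is a hypothetical helper name):

```python
import numpy as np

def leading_eigenvectors(A, k):
    """Return the k eigenvectors of symmetric A with the largest eigenvalues."""
    # np.linalg.eigh returns eigenvalues in ascending order for symmetric input.
    vals, vecs = np.linalg.eigh(A)
    order = np.argsort(vals)[::-1][:k]  # indices of the k largest eigenvalues
    return vecs[:, order]               # columns x_1, ..., x_k
```

`eigh` is preferred over the general `eig` here because $A = D^{-1/2} W D^{-1/2}$ is symmetric whenever $W$ is, which guarantees real eigenvalues and orthogonal eigenvectors.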
4. Using the $k$ leading eigenvectors, project the original data into the new space defined by the $k$ leading eigenvectors, and run a clustering algorithm such as $k$-means to find $k$ clusters.

The Ng-Jordan-Weiss algorithm stacks the $k$ largest eigenvectors in columns to form a matrix $X = [x_1\, x_2\, \cdots\, x_k] \in \mathbb{R}^{n \times k}$. The algorithm forms a matrix $Y$ by renormalizing each row in $X$ to have unit length, that is,

$$Y_{ij} = \frac{X_{ij}}{\sqrt{\sum_{j=1}^{k} X_{ij}^{2}}}. \qquad (11.26)$$

The algorithm then treats each row in $Y$ as a point in the $k$-dimensional space $\mathbb{R}^k$, and runs $k$-means (or any other algorithm serving the partitioning purpose) to cluster the points into $k$ clusters.
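Step 4 can be sketched as the row normalization of equation (11.26) followed by a plain Lloyd's-iteration $k$-means. This is an illustrative sketch, not the authors' implementation; the function names are hypothetical, and any off-the-shelf $k$-means (e.g., scikit-learn's) could replace the one shown.

```python
import numpy as np

def row_normalize(X):
    """Renormalize each row of X to unit length (eq. 11.26)."""
    norms = np.sqrt((X ** 2).sum(axis=1, keepdims=True))
    return X / norms

def kmeans(Y, k, iters=100, seed=0):
    """A plain Lloyd's k-means on the rows of Y; returns a label per row."""
    rng = np.random.default_rng(seed)
    centers = Y[rng.choice(len(Y), size=k, replace=False)]  # random initial centers
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        dists = ((Y[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(dists, axis=1)
        # Recompute each center as the mean of its assigned points.
        new_centers = np.array([Y[labels == j].mean(axis=0) if np.any(labels == j)
                                else centers[j] for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels
```

Each row of $Y$ lies on the unit sphere in $\mathbb{R}^k$; in practice, points from the same cluster tend to land close together there, which is why an ordinary $k$-means suffices as the final partitioning step.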