where we scale the eigenvectors to give $\mathbf{L}'\mathbf{W}\mathbf{L} = \mathbf{I}$, noting that although the eigenvectors are always orthogonal, their scaling is arbitrary. Our choice of scaling gives $\mathbf{L}' = \mathbf{L}^{-1}\mathbf{W}^{-1}$ and $\mathbf{L}\mathbf{L}'\mathbf{W}\mathbf{L}\mathbf{L}' = \mathbf{L}\mathbf{I}\mathbf{L}'$, or $\mathbf{L}\mathbf{L}' = \mathbf{W}^{-1}$, as required.
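This scaling can be checked numerically with a generalized symmetric eigensolver. The sketch below uses made-up matrices $\mathbf{B}$ and $\mathbf{W}$ standing in for the two matrices of the eigenequation (an assumption, since (4.3) itself is not shown in this excerpt); `scipy.linalg.eigh(B, W)` scales its eigenvectors so that $\mathbf{L}'\mathbf{W}\mathbf{L} = \mathbf{I}$, from which $\mathbf{L}\mathbf{L}' = \mathbf{W}^{-1}$ follows.

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(1)
p = 4
A = rng.standard_normal((12, p))
W = A.T @ A / 12                     # hypothetical within-groups matrix (positive definite)
Bm = rng.standard_normal((6, p))
B = Bm.T @ Bm / 6                    # hypothetical between-groups matrix

# eigh solves B l = lambda W l, scaling the eigenvectors so that L'WL = I.
vals, L = eigh(B, W)

print(np.allclose(L.T @ W @ L, np.eye(p)))     # True: L'WL = I
print(np.allclose(L @ L.T, np.linalg.inv(W)))  # True: hence LL' = W^{-1}
```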
Given a sample $\mathbf{x}' = (x_1, x_2, \ldots, x_p)$, the transformed values $\mathbf{y}' = \mathbf{x}'\mathbf{L}$ of the $p$ variables are said to be canonical variables, and the $p$-dimensional space generated by the rows of $\mathbf{L}$ is called the canonical space. In particular, the data matrix $\mathbf{X}$ transforms to $\mathbf{X}\mathbf{L}$ and the group means $\bar{\mathbf{X}}$ transform to $\bar{\mathbf{X}}\mathbf{L}$ in the canonical space. The transformed group means $\bar{\mathbf{X}}\mathbf{L}$ are the means of the canonical variables and are called the canonical means for short. Furthermore, the inner product associated with the canonical means is $\bar{\mathbf{X}}\mathbf{L}\mathbf{L}'\bar{\mathbf{X}}' = \bar{\mathbf{X}}\mathbf{W}^{-1}\bar{\mathbf{X}}'$, confirming the Mahalanobis distances between the means $\bar{\mathbf{X}}$. Thus we have a property that is of practical consequence: Mahalanobis distances between the means $\bar{\mathbf{X}}$ are represented in the canonical space as ordinary Pythagorean distances between the canonical means.
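This distance property is easy to verify numerically. The sketch below uses made-up group means, and any $\mathbf{L}$ with $\mathbf{L}\mathbf{L}' = \mathbf{W}^{-1}$ suffices for the identity; here $\mathbf{L}$ is built from a Cholesky factor of $\mathbf{W}$ rather than from the eigenequation, which is an illustrative shortcut.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 3
A = rng.standard_normal((10, p))
W = A.T @ A / 10                       # hypothetical within-groups matrix
Xbar = rng.standard_normal((4, p))     # hypothetical group means, K = 4

# Any L with L L' = W^{-1} will do for the distance identity; take L = inv(R')
# where W = R R' is the Cholesky factorisation.
R = np.linalg.cholesky(W)
L = np.linalg.inv(R.T)

# Mahalanobis distance between the first two means ...
d = Xbar[0] - Xbar[1]
mahal = np.sqrt(d @ np.linalg.inv(W) @ d)

# ... equals the Pythagorean distance between the corresponding canonical means.
Y = Xbar @ L                           # canonical means
eucl = np.linalg.norm(Y[0] - Y[1])
print(np.isclose(mahal, eucl))         # True
```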
Since $\mathbf{L}$ is nonsingular, the rank of the matrix $\bar{\mathbf{X}}\mathbf{L}$ is the same as the rank of the $K \times p$ matrix $\bar{\mathbf{X}}$. However, the weighted sum of the rows of $\bar{\mathbf{X}}$ vanishes, so $\operatorname{rank}(\bar{\mathbf{X}}) \le \min(K-1, p)$. Therefore, the points given by the $K$ rows of $\bar{\mathbf{X}}\mathbf{L}$ will occupy at most $m = \min(K-1, p)$ dimensions of the canonical space and may be approximated in fewer dimensions by PCA (Chapter 3) using the SVD of $\bar{\mathbf{X}}\mathbf{L}$. The whole process of approximating $\bar{\mathbf{X}}$ in the canonical space may thus be regarded as a two-step process: first finding $\mathbf{L}$ as the scaled solution to (4.3), and then, in the second step, finding the SVD of $\bar{\mathbf{X}}\mathbf{L}$.
We may proceed directly to the eigenvalue decomposition:
$$
\mathbf{L}'(\bar{\mathbf{X}}'\mathbf{C}\bar{\mathbf{X}})\mathbf{L}\mathbf{V} = \mathbf{V}\boldsymbol{\Lambda},
\qquad (4.4)
$$
where $\mathbf{C}$ represents a centring operation (discussed below) and the columns of $\mathbf{V}$ are orthogonal eigenvectors. For the reasons stated above, this equation can have only $m$ nonzero eigenvalues, implying zero coordinates for the means in the remaining dimensions.
We note that equation (4.4) may be written
$$
\mathbf{L}\mathbf{L}'(\bar{\mathbf{X}}'\mathbf{C}\bar{\mathbf{X}})\mathbf{L}\mathbf{V} = \mathbf{L}\mathbf{V}\boldsymbol{\Lambda}
\quad\text{or}\quad
(\bar{\mathbf{X}}'\mathbf{C}\bar{\mathbf{X}})(\mathbf{L}\mathbf{V}) = \mathbf{W}(\mathbf{L}\mathbf{V})\boldsymbol{\Lambda},
\qquad (4.5)
$$
which is a two-sided eigenvalue problem with eigenvectors $\mathbf{M} = \mathbf{L}\mathbf{V}$. The eigenvectors should be normalized so that $\mathbf{M}'\mathbf{W}\mathbf{M} = \mathbf{V}'\mathbf{L}'\mathbf{W}\mathbf{L}\mathbf{V} = \mathbf{V}'\mathbf{I}\mathbf{V} = \mathbf{I}$. The solution $\mathbf{L}\mathbf{V}$ incorporates both the transformation to canonical variables and the PCA orthogonal transformation $\mathbf{V}$, thus subsuming both steps into one calculation. It is therefore common practice to use the two-sided eigenvalue form for computation, avoiding the two separate steps discussed above. However, we think that the two-step form is the more informative. Once we have transformed to canonical variables, we are concerned essentially with PCA, and everything said in Chapter 3 remains valid. In particular, the PCA
approximation $\hat{\bar{\mathbf{X}}}\mathbf{L}$ to the canonical means $\bar{\mathbf{X}}\mathbf{L}$ is $(\bar{\mathbf{X}}\mathbf{L})\mathbf{V}\mathbf{J}\mathbf{V}'$, giving
$$
\hat{\bar{\mathbf{X}}} = \bar{\mathbf{X}}(\mathbf{L}\mathbf{V})\mathbf{J}(\mathbf{V}'\mathbf{L}^{-1}) = \bar{\mathbf{X}}(\mathbf{M}\mathbf{J}\mathbf{M}^{-1}),
\qquad (4.6)
$$
indicating, as usual, the dimensionality of the approximation by the diagonal matrix $\mathbf{J}$.
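The one-step computation and the approximation (4.6) can be sketched together, again with made-up inputs. Here the centring is applied directly to hypothetical group means before forming $\bar{\mathbf{X}}'\mathbf{C}\bar{\mathbf{X}}$, and $\mathbf{J}$ retains the two leading dimensions:

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(3)
K, p = 4, 3
A = rng.standard_normal((10, p))
W = A.T @ A / 10                           # hypothetical within-groups matrix
Xbar = rng.standard_normal((K, p))
Xbar -= Xbar.mean(axis=0)                  # centring applied to the means

# Two-sided problem (4.5): (X'CX) M = W M Lambda, eigenvectors with M'WM = I.
vals, M = eigh(Xbar.T @ Xbar, W)
M = M[:, np.argsort(vals)[::-1]]           # order by decreasing eigenvalue

# Equation (4.6): J = diag(1, 1, 0) keeps the two leading dimensions.
J = np.diag([1.0, 1.0, 0.0])
Xhat = Xbar @ M @ J @ np.linalg.inv(M)

print(np.allclose(M.T @ W @ M, np.eye(p)))  # True: M'WM = I
print(np.linalg.matrix_rank(Xhat))          # at most 2
```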