with z = 0 . To avoid this, we either fix the total sum of squares to 1, say, or express the
criterion as a ratio,
$$\frac{z' G' G z}{z' L z},$$
in which case the scaling is arbitrary (though for convenience usually fixed to $z' L z = 1$).
In this ratio, $z' L z$ represents the total sum of squares, while the numerator is the sum of
squares between the row totals $Gz$. The maximization of the ratio requires the solution
to the two-sided eigenvalue problem
$$G' G z = \lambda L z,$$
which may be written as
$$(L^{-1/2} G' G L^{-1/2})\, L^{1/2} z = \lambda L^{1/2} z,$$
where, apart from the irrelevant factor $p^{-1}$, the matrix in parentheses is the normalized
Burt matrix. Thus, we arrive back at our previous MCA eigenvalue problem. The eigenvector
$L^{1/2} z$ may be normalized in the usual way so that $z' L z = 1$, in which case the
sum of squares of the row scores $z' G' G z = \lambda$, which is maximized by taking the largest
nonunit eigenvalue. The deviations from the mean have been ignored in this derivation,
but we have already seen when discussing CA that these are accounted for by the vectors
associated with the rejected unit eigenvalue.
Just as with the correlational approach to CA (Section 7.2.5), the homogeneity analysis
criterion seeks only a one-dimensional solution but the multidimensional solution may
be justified as above.
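
The eigenvalue problem above can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: the small data set and the helper names `indicator_matrix` and `homogeneity_eigen` are made up for illustration, and $L$ is assumed to be the diagonal matrix of category frequencies, $\mathrm{diag}(G'G)$. The factor $p^{-1}$ is not applied, so the rejected trivial eigenvalue appears here as $p$ rather than 1.

```python
import numpy as np

def indicator_matrix(data):
    """Stack one 0/1 indicator block per categorical column: G = [G_1, ..., G_p]."""
    blocks = []
    for col in data.T:
        levels = np.unique(col)
        blocks.append((col[:, None] == levels[None, :]).astype(float))
    return np.hstack(blocks)

def homogeneity_eigen(G):
    """Solve G'G z = lambda L z, with L = diag(G'G), via the symmetric form."""
    gram = G.T @ G
    l_diag = np.diag(gram).copy()        # category frequencies (diagonal of L)
    scale = 1.0 / np.sqrt(l_diag)        # diagonal of L^{-1/2}
    sym = scale[:, None] * gram * scale[None, :]   # L^{-1/2} G'G L^{-1/2}
    eigvals, y = np.linalg.eigh(sym)     # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]    # reorder to descending
    eigvals, y = eigvals[order], y[:, order]
    z = scale[:, None] * y               # z = L^{-1/2} y, hence z'Lz = y'y = 1
    return eigvals, z, l_diag

# Hypothetical data: 8 cases on p = 3 categorical variables (integer-coded).
data = np.array([
    [0, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [2, 0, 1],
    [2, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
])
p = data.shape[1]

G = indicator_matrix(data)
eigvals, z, l_diag = homogeneity_eigen(G)

# Column 0 is the trivial solution (constant z); column 1 gives the
# quantifications for the largest nontrivial eigenvalue.
z1 = z[:, 1]
print("trivial eigenvalue (equals p without the 1/p factor):", eigvals[0])
print("largest nontrivial eigenvalue:", eigvals[1])
print("z'Lz =", z1 @ (l_diag * z1), " z'G'Gz =", z1 @ (G.T @ G) @ z1)
```

Working with the symmetric matrix $L^{-1/2} G' G L^{-1/2}$ keeps the sketch dependent on NumPy only; a generalized symmetric eigensolver would serve equally well. Further columns of `z` give the additional dimensions of the multidimensional solution.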
8.7 Correlational approach
We have seen that the CA of a two-way table may be developed as seeking quantifications
that maximize the correlation between the two categorical variables. This extends to
MCA. One way of developing canonical correlation (CCA) for quantitative variables in
data matrices $X_1$, $X_2$ is to find linear transformations $Z_1$, $Z_2$ that minimize
$$\|X_1 Z_1 - X_2 Z_2\|^2,$$
normalized so that $\mathrm{diag}(Z_1' X_1' X_1 Z_1 + Z_2' X_2' X_2 Z_2) = 2I$ (see Gower and Dijksterhuis,
2004, for a fuller discussion of appropriate constraints and the links with other
formulations of CCA). Apart from trivial factors of proportionality, this criterion may be
written as
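$$\sum_{k=1}^{2} \|X_k Z_k - M\|^2, \quad \text{where } M = \tfrac{1}{2} \sum_{k=1}^{2} X_k Z_k,$$
to be minimized with the constraint $\mathrm{diag}(Z_1' X_1' X_1 Z_1 + Z_2' X_2' X_2 Z_2) = 2I$. (The factor of
proportionality is $\tfrac{1}{2}$, because $X_k Z_k - M = \pm\tfrac{1}{2}(X_1 Z_1 - X_2 Z_2)$.) For indicator
matrices $G_1$, $G_2$ this becomes the minimization of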
$$\sum_{k=1}^{2} \|G_k Z_k - M\|^2, \quad \text{where } M = \tfrac{1}{2} \sum_{k=1}^{2} G_k Z_k,$$