Multiple correspondence analysis - Understanding Biplots


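with $\mathbf{z} = \mathbf{0}$. To avoid this, we either fix the total sum of squares to 1, say, or express the criterion as a ratio,

$$\frac{\mathbf{z}'\mathbf{G}'\mathbf{G}\mathbf{z}}{\mathbf{z}'\mathbf{L}\mathbf{z}},$$

in which case the scaling is arbitrary (though for convenience usually fixed to $\mathbf{z}'\mathbf{L}\mathbf{z} = 1$).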

In this ratio, $\mathbf{z}'\mathbf{L}\mathbf{z}$ represents the total sum of squares, while the numerator is the sum of squares between the row totals $\mathbf{G}\mathbf{z}$. The maximization of the ratio requires the solution to the two-sided eigenvalue problem

$$\mathbf{G}'\mathbf{G}\mathbf{z} = \lambda\,\mathbf{L}\mathbf{z},$$

which may be written as

$$\left(\mathbf{L}^{-1/2}\mathbf{G}'\mathbf{G}\mathbf{L}^{-1/2}\right)\mathbf{L}^{1/2}\mathbf{z} = \lambda\,\mathbf{L}^{1/2}\mathbf{z},$$

where, apart from the irrelevant factor $p^{-1}$, the matrix in parentheses is the normalized Burt matrix. Thus, we arrive back at our previous MCA eigenvalue problem. The eigenvector $\mathbf{L}^{1/2}\mathbf{z}$ may be normalized in the usual way so that $\mathbf{z}'\mathbf{L}\mathbf{z} = 1$, in which case the sum of squares of the row scores $\mathbf{z}'\mathbf{G}'\mathbf{G}\mathbf{z} = \lambda$, which is maximized by taking the largest nonunit eigenvalue. The deviations from the mean have been ignored in this derivation, but we have already seen when discussing CA that these are accounted for by the vectors associated with the rejected unit eigenvalue.
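The derivation above can be checked numerically. The following is a minimal sketch, assuming NumPy and a small made-up data set (the array `data`, the helper `indicator`, and all variable names are illustrative and not from the text). It builds the indicator matrix $\mathbf{G}$, takes $\mathbf{L} = \operatorname{diag}(\mathbf{G}'\mathbf{G})$, solves the symmetric form of the two-sided eigenvalue problem, and back-transforms to $\mathbf{z}$. Because the factor $p^{-1}$ is not applied here, the rejected trivial eigenvalue appears as $p$ rather than 1.

```python
import numpy as np

# Hypothetical toy data: 6 cases, p = 3 categorical variables coded 0, 1, 2.
data = np.array([
    [0, 1, 0],
    [1, 0, 2],
    [0, 0, 1],
    [1, 1, 2],
    [0, 1, 1],
    [1, 0, 0],
])
n, p = data.shape


def indicator(column):
    """Indicator (one-hot) block for a single categorical variable."""
    levels = np.unique(column)
    return (column[:, None] == levels[None, :]).astype(float)


# G = [G_1, ..., G_p]; every row of G sums to p.
G = np.hstack([indicator(data[:, k]) for k in range(p)])

# G'G is the Burt matrix; its diagonal holds the category frequencies (L).
burt = G.T @ G
l = np.diag(burt)
L_inv_sqrt = np.diag(1.0 / np.sqrt(l))

# Symmetric form: (L^{-1/2} G'G L^{-1/2}) u = lambda u, with u = L^{1/2} z.
S = L_inv_sqrt @ burt @ L_inv_sqrt
eigvals, U = np.linalg.eigh(S)        # eigenvalues returned in ascending order
order = np.argsort(eigvals)[::-1]     # reorder to descending
eigvals, U = eigvals[order], U[:, order]

# Back-transform: z = L^{-1/2} u, so z'Lz = u'u = 1 automatically.
Z = L_inv_sqrt @ U

trivial = eigvals[0]   # equals p here; becomes 1 after the factor 1/p
lam = eigvals[1]       # largest non-trivial eigenvalue
z = Z[:, 1]
row_scores = G @ z     # row totals Gz for the optimal quantification

print("trivial eigenvalue:", trivial)
print("largest non-trivial eigenvalue:", lam)
print("sum of squares of row scores:", row_scores @ row_scores)
```

In this sketch the sum of squares of the row scores reproduces $\lambda$, and the row scores sum to zero even though no centring was applied, which is the sense in which the mean is absorbed by the rejected trivial solution.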

Just as with the correlational approach to CA (Section 7.2.5), the homogeneity analysis criterion seeks only a one-dimensional solution, but the multidimensional solution may be justified as above.

8.7 Correlational approach

We have seen that the CA of a two-way table may be developed as seeking quantifications that maximize the correlation between the two categorical variables. This extends to MCA. One way of developing canonical correlation (CCA) for quantitative variables in data matrices $\mathbf{X}_1, \mathbf{X}_2$ is to find linear transformations $\mathbf{Z}_1, \mathbf{Z}_2$ that minimize

$$\left\| \mathbf{X}_1\mathbf{Z}_1 - \mathbf{X}_2\mathbf{Z}_2 \right\|^2$$

normalized so that $\operatorname{diag}(\mathbf{Z}_1'\mathbf{X}_1'\mathbf{X}_1\mathbf{Z}_1 + \mathbf{Z}_2'\mathbf{X}_2'\mathbf{X}_2\mathbf{Z}_2) = 2\mathbf{I}$ (see Gower and Dijksterhuis, 2004, for a fuller discussion of appropriate constraints and the links with other formulations of CCA). Apart from trivial factors of proportionality, this criterion may be written as

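$$\sum_{k=1}^{2} \left\| \mathbf{X}_k\mathbf{Z}_k - \mathbf{M} \right\|^2, \quad \text{where } \mathbf{M} = \tfrac{1}{2}\sum_{k=1}^{2} \mathbf{X}_k\mathbf{Z}_k,$$

to be minimized with the constraint $\operatorname{diag}(\mathbf{Z}_1'\mathbf{X}_1'\mathbf{X}_1\mathbf{Z}_1 + \mathbf{Z}_2'\mathbf{X}_2'\mathbf{X}_2\mathbf{Z}_2) = 2\mathbf{I}$. For indicator matrices $\mathbf{G}_1, \mathbf{G}_2$ this becomes the minimization of

$$\sum_{k=1}^{2} \left\| \mathbf{G}_k\mathbf{Z}_k - \mathbf{M} \right\|^2, \quad \text{where } \mathbf{M} = \tfrac{1}{2}\sum_{k=1}^{2} \mathbf{G}_k\mathbf{Z}_k,$$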
