Corresponding to $R^{-1/2} X C^{-1/2}$ for the correspondence analysis of a contingency
table $X$, for the analysis of $G$ we have $R = pI$ and $C = L$, giving

$$p^{-1/2} G L^{-1/2} = U \Sigma V'. \qquad (8.1)$$
The factor $p^{-1/2}$ is superfluous, but we retain it below to maintain the link with simple
CA. As in Chapter 7, the first singular vectors, corresponding to a unit singular value,
may be ignored, being equivalent to working in deviations from the column means. These
vectors are $\mathbf{1}$ and $L^{1/2}\mathbf{1}$ which, through the orthogonality properties of
singular vectors, imply that the remaining singular vectors satisfy

$$\mathbf{1}'U = 0 \quad \text{and} \quad \mathbf{1}'L^{1/2}V = 0. \qquad (8.2)$$
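The unit singular value and the orthogonality conditions can be checked numerically. The following is a minimal sketch in base R, not the book's own code: the data frame `dat` and its values are hypothetical, and the indicator matrix is built by hand rather than with any package function.

```r
# Hypothetical toy data: 6 individuals, p = 3 categorical variables
dat <- data.frame(
  sex  = c("M", "F", "M", "F", "M", "F"),
  hair = c("brown", "fair", "fair", "brown", "dark", "dark"),
  work = c("manual", "clerical", "manual", "manual", "clerical", "clerical"),
  stringsAsFactors = TRUE
)
p <- ncol(dat)

# Indicator matrix G: one 0/1 column per category level
G <- do.call(cbind, lapply(dat, function(f) {
  sapply(levels(f), function(lev) as.numeric(f == lev))
}))

# Diagonal of L: the category frequencies (column sums of G)
counts <- colSums(G)

# SVD of p^{-1/2} G L^{-1/2}
S  <- G %*% diag(1 / sqrt(counts)) / sqrt(p)
sv <- svd(S)

# The leading singular value should equal 1
sv$d[1]

# Drop the trivial first singular vectors
U <- sv$u[, -1]
V <- sv$v[, -1]

# Orthogonality conditions: both should be numerically zero
colSums(U)                    # 1'U
crossprod(sqrt(counts), V)    # 1'L^{1/2}V
```

The sketch assumes the toy data are connected (no subset of individuals shares no categories with the rest), so that the unit singular value is not repeated.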
As in Chapter 7, there are several variants that might be plotted, as summarized in
Table 7.1. These remain available, but the usual choice for MCA is that for chi-squared
distance. We saw in Chapter 7 that this is equivalent to a weighted PCA, but the row
weights are now all equal ($R = pI_n$), leading to simplifications, where for approximating
row chi-squared distance we plot

$$Z_0 = U \Sigma. \qquad (8.3)$$
In CA itself we would normally include approximations to the column chi-squared
distances in the same diagram but, for reasons discussed below, here we represent the
columns by the projected category-level points (CLPs; see Section 8.3):

$$Z = p^{-1/2} L^{-1/2} V. \qquad (8.4)$$
In these expressions, the singular value of unity together with the first singular vectors
(8.2) are excluded, so U and V refer to the second and subsequent columns of the SVD
of (8.1). As a simple example of an MCA plot let us construct the above row chi-squared
distance MCA plot for the data in Table 8.2 by issuing the following function call:
MCAbipl(X = MCA.Table.1.data[, -1],
        e.vects = 1:2,
        mca.variant = "indicator",
        column.points.size = 1.2,
        row.points.size = 1.2,
        pch.col.points = 15,
        pch.row.points = 16,
        offset.m = list(rep(0.4, 20), rep(0.4, 20)),
        pos.m = list(c(4, 4, 2, 4, 4, 4, 4), rep(4, 15)),
        row.points.col = "green",
        column.points.col = c("red", "red", "blue", "blue", "blue", "blue",
                              "pink", "pink", "pink", "brown", "brown", "brown",
                              "black", "black", "black"))
This call provides the MCA representations in Figures 8.1 - 8.3.
Inspection of Figure 8.1 shows that the weighted centroid of each set of category
points is at the centroid of the row points. Furthermore, each row point is at the
vector-sum of the CLPs referring to its categories. Thus, George is at the vector-sum of
Male, Brown, England, Manual, and School (see Figure 8.2). These results are generally
true and may be summarized as follows:

$$Z_0 = GZ, \quad \mathbf{1}'L_k Z_k = 0, \quad \mathbf{1}'Z_0 = 0. \qquad (8.5)$$
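
The vector-sum and centroid properties can be verified on a small example. The sketch below uses base R only and hypothetical toy data (the data frame `dat` is not from the book). It takes the row coordinates as the non-trivial left singular vectors scaled by their singular values and the CLPs as $p^{-1/2} L^{-1/2} V$, and it keeps only dimensions with non-zero singular values, since the per-variable centroid condition need not hold in null dimensions.

```r
# Hypothetical toy data: 6 individuals, p = 3 categorical variables
dat <- data.frame(
  sex  = c("M", "F", "M", "F", "M", "F"),
  hair = c("brown", "fair", "fair", "brown", "dark", "dark"),
  work = c("manual", "clerical", "manual", "manual", "clerical", "clerical"),
  stringsAsFactors = TRUE
)
p <- ncol(dat)

# Indicator matrix G and category frequencies (the diagonal of L)
G <- do.call(cbind, lapply(dat, function(f) {
  sapply(levels(f), function(lev) as.numeric(f == lev))
}))
counts <- colSums(G)

# SVD of p^{-1/2} G L^{-1/2}
sv <- svd(G %*% diag(1 / sqrt(counts)) / sqrt(p))

# Keep non-trivial dimensions with non-zero singular values
keep <- which(sv$d > 1e-8)[-1]
U <- sv$u[, keep, drop = FALSE]
V <- sv$v[, keep, drop = FALSE]

# Row points and category-level points (CLPs)
Z0 <- U %*% diag(sv$d[keep], nrow = length(keep))
Z  <- diag(1 / sqrt(counts)) %*% V / sqrt(p)

# Each row point is the vector-sum of the CLPs of its categories
vector_sum_error <- max(abs(G %*% Z - Z0))

# Frequency-weighted sum of the CLPs within each variable (1'L_k Z_k)
block <- rep(names(dat), times = sapply(dat, nlevels))
block_centroids <- rowsum(counts * Z, group = block)

# Centroid of the row points (1'Z_0)
row_centroid <- colSums(Z0)
```

All three quantities (`vector_sum_error`, `block_centroids`, `row_centroid`) should be zero up to rounding error.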