Figure 8.4 Demonstrating the vector-sum method when row point coordinates are reduced to p 1 U and columns plotted by using Z = p 1 / 2 L 1 / 2 V as in Figure 8.3.
chi-squared distance between two samples i and i' occurs for those variables that differ.
Thus, if the levels of the kth variable differ and the two differing levels have frequencies
$l_i$ and $l_{i'}$, then the contribution is

$$\frac{1}{p^2}\left(\frac{1}{l_i} + \frac{1}{l_{i'}}\right), \qquad (8.8)$$

depending only on the inverse frequencies of the categories. This is indeed a somewhat
strange distance and, at the very least, bears examination in every instance of its use. We
discuss other choices of distance below. The interpretation of column chi-squared distance
between two category levels j and j' is even harder to justify. This is because we have
to reconcile the meaning of distances between levels that refer to the same categorical
variable and those that do not (e.g. distance between grey and brown, interpretable as
a mismatch, and distance between grey and university, which is meaningless). For this
reason we do not explore the column chi-squared distance version of MCA. Rather, as we
have seen, the columns (variables) are shown as projected CLPs with their useful centroid
and interpolation properties. This gives an asymmetric representation as is appropriate
for a data matrix.
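
The per-variable contribution in (8.8) can be checked numerically. The sketch below is illustrative only: the small data set, the function names and the variable names are hypothetical, and it assumes the squared row chi-squared distance is computed from the rows of G/p (G being the indicator matrix of the p categorical variables), with each column weighted by the inverse of its category frequency. Under that assumption, summing (8.8) over the variables on which two samples differ should reproduce the squared distance computed directly from G.

```python
import numpy as np

# Hypothetical toy data: 5 samples, p = 3 categorical variables.
data = np.array([
    ["fair", "Scotland", "F"],
    ["dark", "Wales",    "M"],
    ["fair", "England",  "F"],
    ["grey", "England",  "M"],
    ["dark", "England",  "F"],
])

def indicator_matrix(data):
    """Stack one block of 0/1 dummy columns per categorical variable."""
    blocks = []
    for k in range(data.shape[1]):
        levels = np.unique(data[:, k])
        blocks.append((data[:, [k]] == levels[None, :]).astype(float))
    return np.hstack(blocks)

def row_chi2_squared(G, p, i, i2):
    """Squared distance between rows i and i2 of G/p, with each column
    weighted by the inverse of its category frequency (assumed convention)."""
    freq = G.sum(axis=0)              # category frequencies
    diff = (G[i] - G[i2]) / p
    return float(np.sum(diff ** 2 / freq))

def sum_of_contributions(data, i, i2):
    """Sum the contribution (1/p^2)(1/l_i + 1/l_i') over differing variables."""
    p = data.shape[1]
    total = 0.0
    for k in range(p):
        a, b = data[i, k], data[i2, k]
        if a != b:
            l_a = np.sum(data[:, k] == a)   # frequency of the level of sample i
            l_b = np.sum(data[:, k] == b)   # frequency of the level of sample i2
            total += (1.0 / l_a + 1.0 / l_b) / p ** 2
    return total

G = indicator_matrix(data)
p = data.shape[1]
for i in range(len(data)):
    for i2 in range(i + 1, len(data)):
        direct = row_chi2_squared(G, p, i, i2)
        by_formula = sum_of_contributions(data, i, i2)
        print(i, i2, round(direct, 4), round(by_formula, 4))
```

Samples that differ only on rare category levels end up far apart, while samples that differ on common levels end up close together, which illustrates why the text calls this a somewhat strange distance.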