so the basic equations now become
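\[
\begin{aligned}
(\mathbf{X}-\mathbf{E})\,\mathbf{z}_2 &= \rho\,(\mathbf{R}-\mathbf{R}\mathbf{1}\mathbf{1}'\mathbf{R}/n)\,\mathbf{z}_1,\\
(\mathbf{X}-\mathbf{E})'\,\mathbf{z}_1 &= \rho\,(\mathbf{C}-\mathbf{C}\mathbf{1}\mathbf{1}'\mathbf{C}/n)\,\mathbf{z}_2,
\end{aligned}
\tag{7.33}
\]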
which may be rewritten as
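\[
\begin{aligned}
[\mathbf{R}^{-1/2}(\mathbf{X}-\mathbf{E})\mathbf{C}^{-1/2}]\,\mathbf{C}^{1/2}\mathbf{z}_2 &= \rho\,\mathbf{R}^{1/2}\mathbf{z}_1 - \rho\,\mathbf{R}^{1/2}\mathbf{1}\,(\mathbf{1}'\mathbf{R}\mathbf{z}_1)/n,\\
[\mathbf{C}^{-1/2}(\mathbf{X}-\mathbf{E})'\mathbf{R}^{-1/2}]\,\mathbf{R}^{1/2}\mathbf{z}_1 &= \rho\,\mathbf{C}^{1/2}\mathbf{z}_2 - \rho\,\mathbf{C}^{1/2}\mathbf{1}\,(\mathbf{1}'\mathbf{C}\mathbf{z}_2)/n.
\end{aligned}
\tag{7.34}
\]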
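Now, from the usual orthogonality of the singular vectors of $\mathbf{R}^{-1/2}\mathbf{X}\mathbf{C}^{-1/2}$,
we know that $\mathbf{1}'\mathbf{R}\mathbf{z}_1$ and $\mathbf{1}'\mathbf{C}\mathbf{z}_2$ are zero, but
$[\mathbf{R}^{-1/2}(\mathbf{X}-\mathbf{E})\mathbf{C}^{-1/2}]\,\mathbf{C}^{1/2}\mathbf{1} = \mathbf{0}$ and
$[\mathbf{C}^{-1/2}(\mathbf{X}-\mathbf{E})'\mathbf{R}^{-1/2}]\,\mathbf{R}^{1/2}\mathbf{1} = \mathbf{0}$,
showing that $\mathbf{R}^{1/2}\mathbf{1}$ and $\mathbf{C}^{1/2}\mathbf{1}$ are also a singular vector
pair of $\mathbf{R}^{-1/2}(\mathbf{X}-\mathbf{E})\mathbf{C}^{-1/2}$ with zero as corresponding singular value. Furthermore,
$\mathbf{R}^{-1/2}\mathbf{X}\mathbf{C}^{-1/2}$ and $\mathbf{R}^{-1/2}(\mathbf{X}-\mathbf{E})\mathbf{C}^{-1/2}$ have the same set of singular vectors (apart
from multiplication by $-1$) and, apart from the singular value associated with the
singular vector pair $\mathbf{R}^{1/2}\mathbf{1}$ and $\mathbf{C}^{1/2}\mathbf{1}$, also the same singular values. It follows that to
take care of centring, we only need to replace $\mathbf{X}$ by $\mathbf{X}-\mathbf{E}$ in (7.31). Thus, $\mathbf{z}_1$ and $\mathbf{z}_2$
are given by the columns of $\mathbf{R}^{-1/2}\mathbf{U}$ and $\mathbf{C}^{-1/2}\mathbf{V}$ respectively, as for chi-squared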
distance. However, there is nothing in the correlational criterion to suggest an interest
in chi-squared distance; the aim is purely to derive the quantifications. If we plot the
quantified coordinates of the $n$ samples, $\mathbf{G}_1\mathbf{z}_1$ and $\mathbf{G}_2\mathbf{z}_2$, then these merely repeat the
$p$ values of $\mathbf{z}_1$ (each as many times as the corresponding diagonal value of $\mathbf{R}$) and the
$q$ values of $\mathbf{z}_2$ (each as many times as the corresponding diagonal value of $\mathbf{C}$),
and nothing is gained. We note also that, by maximizing correlation,
a one-dimensional solution is implied, in which case the scaling by the first singular
value is immaterial and may be ignored altogether, so that it suffices to set $\mathbf{z}_1 = \mathbf{R}^{-1/2}\mathbf{u}$
and $\mathbf{z}_2 = \mathbf{C}^{-1/2}\mathbf{v}$. Multidimensional solutions need additional justification, for example
by appealing to chi-squared distance. Perhaps the main interest in the correlational
derivation is as an introduction to multiple correspondence analysis (Chapter 8).
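
The centring argument above can be checked numerically. The following is a minimal NumPy sketch, not taken from the text: the 3 x 4 contingency table is made-up illustrative data, and the variable names are assumptions. It compares the singular values of the uncentred and centred scaled matrices, forms the one-dimensional quantifications, and computes the correlation between the quantified row and column variables over the n samples.

```python
import numpy as np

# Hypothetical 3 x 4 contingency table (illustrative data, not from the text).
X = np.array([[20., 10.,  5., 15.],
              [ 8., 12., 16.,  4.],
              [ 6.,  9., 11., 14.]])
n = X.sum()
r = X.sum(axis=1)              # row totals: diagonal of R
c = X.sum(axis=0)              # column totals: diagonal of C
E = np.outer(r, c) / n         # independence model E = R 1 1' C / n

R_inv_half = np.diag(r ** -0.5)
C_inv_half = np.diag(c ** -0.5)

# Singular values of the uncentred matrix, and full SVD of the centred one.
s_raw = np.linalg.svd(R_inv_half @ X @ C_inv_half, compute_uv=False)
U, s, Vt = np.linalg.svd(R_inv_half @ (X - E) @ C_inv_half,
                         full_matrices=False)

print("uncentred singular values:", s_raw)
print("centred singular values:  ", s)

# One-dimensional quantifications z1 = R^{-1/2} u and z2 = C^{-1/2} v.
z1 = R_inv_half @ U[:, 0]
z2 = C_inv_half @ Vt[0]
print("1'R z1 =", r @ z1, "  1'C z2 =", c @ z2)

# Expand to the n samples: cell (i, j) contributes X[i, j] samples carrying
# the pair (z1[i], z2[j]); this plays the role of G1 z1 and G2 z2.
counts = X.astype(int)
i_idx, j_idx = np.indices(counts.shape)
rows = np.repeat(i_idx.ravel(), counts.ravel())
cols = np.repeat(j_idx.ravel(), counts.ravel())
rho = np.corrcoef(z1[rows], z2[cols])[0, 1]
print("correlation of quantified variables:", rho)
print("first centred singular value:       ", s[0])
```

In this sketch the uncentred matrix carries the trivial singular value 1, which centring replaces by 0 while leaving the remaining singular values unchanged, and the correlation of the quantified variables agrees with the first centred singular value.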
7.2.6 Approximating the row profiles
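Yet another variant comes from focusing on the row profiles, given by $\mathbf{R}^{-1}\mathbf{X}$, suggesting
an interest in fitting
\[
\left\| \mathbf{R}^{1/2}\{\mathbf{R}^{-1}(\mathbf{X}-\mathbf{E})\}\mathbf{C}^{-1/2} \right\|^2,
\tag{7.35}
\]
giving $\mathbf{U}\boldsymbol{\Sigma}\mathbf{V}' = \mathbf{R}^{-1/2}(\mathbf{X}-\mathbf{E})\mathbf{C}^{-1/2}$
and $\mathbf{R}^{-1}(\mathbf{X}-\mathbf{E}) = \mathbf{R}^{-1/2}\mathbf{U}\boldsymbol{\Sigma}\mathbf{V}'\mathbf{C}^{1/2}$, with plots of $\mathbf{R}^{-1/2}\mathbf{U}\boldsymbol{\Sigma}$ (as for
chi-squared distance) for the rows and $\mathbf{C}^{1/2}\mathbf{V}$ for the columns. The latter provides axes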
that may be calibrated for the row profiles. Actually, we do not get the pure row profiles
but rather their deviations from the marginal row profile $\mathbf{1}'\mathbf{C}/n$. This follows from
noting that
\[
\mathbf{R}^{-1}\mathbf{E} = \mathbf{R}^{-1}(\mathbf{R}\mathbf{1}\mathbf{1}'\mathbf{C}/n) = \mathbf{1}\mathbf{1}'\mathbf{C}/n .
\]
However, proper approximations to the row profiles may be obtained by approximating
the row profiles $\mathbf{R}^{-1}\mathbf{X}$ directly, omitting the deviations from the independence model.
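
The deviation form of the row-profile approximation can be illustrated with a short NumPy sketch. It is not from the text: the contingency table is made-up data, and the names `row_coords` and `col_axes` are assumptions. Taking the singular value decomposition of the centred, scaled matrix, it forms row coordinates (R^{-1/2} U scaled by the singular values) and column axes C^{1/2} V, and checks that their inner products give the row profiles minus the marginal row profile 1'C/n.

```python
import numpy as np

# Hypothetical 3 x 4 contingency table (illustrative data, not from the text).
X = np.array([[20., 10.,  5., 15.],
              [ 8., 12., 16.,  4.],
              [ 6.,  9., 11., 14.]])
n = X.sum()
r = X.sum(axis=1)                    # diagonal of R
c = X.sum(axis=0)                    # diagonal of C
E = np.outer(r, c) / n               # E = R 1 1' C / n

# SVD of R^{-1/2} (X - E) C^{-1/2}.
S = np.diag(r ** -0.5) @ (X - E) @ np.diag(c ** -0.5)
U, s, Vt = np.linalg.svd(S, full_matrices=False)

row_coords = (np.diag(r ** -0.5) @ U) * s   # R^{-1/2} U Sigma (rows)
col_axes = np.diag(c ** 0.5) @ Vt.T         # C^{1/2} V (columns)

profiles = X / r[:, None]            # row profiles R^{-1} X
marginal = c / n                     # marginal row profile 1'C/n
deviations = profiles - marginal     # R^{-1}(X - E)

# R^{-1} E has every row equal to the marginal row profile.
print(np.diag(1.0 / r) @ E)

# Using all dimensions, the inner products reproduce the deviations.
print(np.allclose(row_coords @ col_axes.T, deviations))

# One-dimensional approximation of the deviations, with the marginal
# profile added back to give approximate row profiles.
approx1 = row_coords[:, :1] @ col_axes[:, :1].T + marginal
print(approx1)
print(approx1.sum(axis=1))
```

Adding the marginal profile back, as in the last step, is only a convenience of this sketch for reading the fitted deviations on the profile scale; it is not the same as approximating the row profiles directly.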