are of more interest. Indeed, writing $x_{i.}$ and $x_{.j}$ for row and column totals, the elements of $\mathbf{R}^{-1/2}(\mathbf{X}-\mathbf{E})\mathbf{C}^{-1/2}$ can be written as

$$\frac{x_{ij} - x_{i.}x_{.j}/n}{\sqrt{x_{i.}x_{.j}}} = \frac{1}{\sqrt{n}}\left(\frac{x_{ij} - x_{i.}x_{.j}/n}{\sqrt{x_{i.}x_{.j}/n}}\right). \tag{7.3}$$
The hypothesis that the independence model $\mathbf{E} = \mathbf{R}\mathbf{1}\mathbf{1}'\mathbf{C}/n$ describes the observed frequencies in the two-way contingency table $\mathbf{X}$ satisfactorily can be tested by Pearson's chi-squared statistic

$$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}} = \sum_i \sum_j \frac{(x_{ij} - x_{i.}x_{.j}/n)^2}{x_{i.}x_{.j}/n}. \tag{7.4}$$
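The quantities in (7.3) and (7.4) can be computed directly for a small table. The following is a minimal sketch in Python with NumPy; the 3x3 table and all variable names are hypothetical, chosen only for illustration and not taken from the text. It forms the expected frequencies under independence, the elements of $\mathbf{R}^{-1/2}(\mathbf{X}-\mathbf{E})\mathbf{C}^{-1/2}$, and Pearson's chi-squared statistic, and prints the statistic next to $n$ times the sum of the squared elements so that the two can be compared.

```python
import numpy as np

# Hypothetical 3x3 contingency table, made up purely for illustration.
X = np.array([[20.0, 10.0,  5.0],
              [10.0, 15.0, 10.0],
              [ 5.0, 10.0, 15.0]])

n = X.sum()                  # grand total
row_totals = X.sum(axis=1)   # x_{i.}
col_totals = X.sum(axis=0)   # x_{.j}

# Expected frequencies under independence: E_ij = x_{i.} x_{.j} / n
E = np.outer(row_totals, col_totals) / n

# Elements of R^{-1/2} (X - E) C^{-1/2}, written elementwise as in (7.3)
S = (X - E) / np.sqrt(np.outer(row_totals, col_totals))

# Pearson's chi-squared statistic, as in (7.4)
chi2 = ((X - E) ** 2 / E).sum()

# n times the sum of squared elements of S should agree with chi-squared
print(chi2, n * (S ** 2).sum())
```

The elementwise division avoids building the diagonal matrices $\mathbf{R}$ and $\mathbf{C}$ explicitly; the result is the same as the matrix product.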
Comparing (7.3) with (7.4) shows that $n^{1/2}$ times the elements of $\mathbf{R}^{-1/2}(\mathbf{X}-\mathbf{E})\mathbf{C}^{-1/2}$ gives exactly the square roots of the contributions to Pearson's $\chi^2$ for a contingency table; these are therefore sometimes termed the Pearson standardized residuals. Thus, we may seek to minimize

$$\left\| \mathbf{R}^{-1/2}(\mathbf{X}-\mathbf{E})\mathbf{C}^{-1/2} - \hat{\mathbf{X}} \right\|^2, \tag{7.5}$$
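with the usual solution, based on the SVD

$$\mathbf{R}^{-1/2}(\mathbf{X}-\mathbf{E})\mathbf{C}^{-1/2} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}', \tag{7.6}$$

of setting

$$\hat{\mathbf{X}} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{J}\mathbf{V}', \tag{7.7}$$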
with $\mathbf{J}$ the diagonal matrix with units in its first $r$ positions and zeros elsewhere. This suggests that, for an $r$-dimensional approximation $\hat{\mathbf{X}}$ to the $\chi^2$ contributions, we plot the first $r$ columns of $\mathbf{U}\boldsymbol{\Sigma}^{1/2}$ and $\mathbf{V}\boldsymbol{\Sigma}^{1/2}$. Biplots of this inner product allow the identification of those elements of $\mathbf{X}$ which diverge from the independence assumption by contributing most, or least, to Pearson's $\chi^2$. Alternative partitions such as $\mathbf{U}\boldsymbol{\Sigma}$, $\mathbf{V}$ that preserve the inner product are also permissible.
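The rank-$r$ approximation and the biplot markers described above can be sketched numerically. The example below is illustrative only: the 4x4 table, the choice $r = 2$, and the variable names are assumptions rather than anything given in the text. It uses `np.linalg.svd` to decompose the matrix of scaled residuals, keeps the first $r$ singular triplets, and forms row and column markers by splitting the singular values equally between the two sets.

```python
import numpy as np

# Hypothetical 4x4 contingency table, made up purely for illustration.
X = np.array([[25.0, 10.0,  5.0,  3.0],
              [12.0, 20.0,  9.0,  4.0],
              [ 6.0, 11.0, 18.0,  8.0],
              [ 2.0,  5.0, 10.0, 22.0]])

n = X.sum()
row_totals = X.sum(axis=1)
col_totals = X.sum(axis=0)
E = np.outer(row_totals, col_totals) / n

# Matrix to be approximated: R^{-1/2} (X - E) C^{-1/2}
S = (X - E) / np.sqrt(np.outer(row_totals, col_totals))

# SVD S = U diag(sing) V'; NumPy returns V' directly as Vt
U, sing, Vt = np.linalg.svd(S, full_matrices=False)

r = 2  # assumed dimension of the display

# Rank-r approximation: keep only the first r singular triplets
X_hat = U[:, :r] @ np.diag(sing[:r]) @ Vt[:r, :]

# Biplot markers: first r columns of U Sigma^{1/2} and V Sigma^{1/2}
row_markers = U[:, :r] * np.sqrt(sing[:r])
col_markers = Vt[:r, :].T * np.sqrt(sing[:r])

# The inner products of the markers reproduce the rank-r approximation
print(np.allclose(row_markers @ col_markers.T, X_hat))
```

Replacing the two marker lines with `U[:, :r] * sing[:r]` and `Vt[:r, :].T` gives one of the alternative partitions, with the same inner product.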
7.2.2 Approximating the deviations from independence
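Instead of approximating the Pearson residuals by $\mathbf{U}\boldsymbol{\Sigma}^{1/2}$ and $\mathbf{V}\boldsymbol{\Sigma}^{1/2}$, an alternative possibility is that one wishes to approximate $\mathbf{X}-\mathbf{E}$, the deviations from independence, but weighted by the inverse square roots of the row and column totals. This requires the minimization of

$$\left\| \mathbf{R}^{-1/2}\{(\mathbf{X}-\mathbf{E}) - \hat{\mathbf{X}}\}\mathbf{C}^{-1/2} \right\|^2, \tag{7.8}$$

which is given by $\mathbf{R}^{-1/2}\hat{\mathbf{X}}\mathbf{C}^{-1/2} = \mathbf{U}\boldsymbol{\Sigma}\mathbf{J}\mathbf{V}'$. Now $\hat{\mathbf{X}} = \mathbf{R}^{1/2}\mathbf{U}\boldsymbol{\Sigma}\mathbf{J}\mathbf{V}'\mathbf{C}^{1/2}$ and we may plot the first $r$ columns of $\mathbf{R}^{1/2}\mathbf{U}\boldsymbol{\Sigma}^{1/2}$ and $\mathbf{C}^{1/2}\mathbf{V}\boldsymbol{\Sigma}^{1/2}$. The difference between the two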