5 Statistical Independence in $m \times n$ Contingency Table
Let us consider an $m \times n$ contingency table shown in Table 2. Statistical independence of $R_1$ and $R_2$ gives the following formulae:

$$P([R_1 = A_i, R_2 = B_j]) = P([R_1 = A_i]) \, P([R_2 = B_j]) \quad (i = 1, \cdots, m, \; j = 1, \cdots, n).$$
According to the definition of the table,

$$\frac{x_{ij}}{N} = \frac{\sum_{k=1}^{n} x_{ik}}{N} \times \frac{\sum_{l=1}^{m} x_{lj}}{N}. \tag{13}$$
Thus, we have obtained:

$$x_{ij} = \frac{\sum_{k=1}^{n} x_{ik} \times \sum_{l=1}^{m} x_{lj}}{N}. \tag{14}$$
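As a minimal illustration of Eq. (14), the following Python sketch checks whether every cell of a table equals the product of its row sum and column sum divided by $N$. The function name `satisfies_eq14` and both sample tables are hypothetical, chosen only for illustration. Exact rational arithmetic is used so that no floating-point tolerance is needed, and the table is assumed to be non-empty with $N > 0$.

```python
from fractions import Fraction

def satisfies_eq14(table):
    """Return True if x_ij == (row sum i) * (column sum j) / N for every cell."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)  # N; assumed to be positive
    return all(
        Fraction(row_sums[i] * col_sums[j], total) == x
        for i, row in enumerate(table)
        for j, x in enumerate(row)
    )

# Hypothetical 2 x 3 table whose second row is twice the first.
independent = [[1, 2, 3],
               [2, 4, 6]]
# Hypothetical table in which one cell breaks the proportionality.
dependent = [[1, 2, 3],
             [2, 4, 7]]

print(satisfies_eq14(independent))
print(satisfies_eq14(dependent))
```

In the first table, for example, the row sums are 6 and 12, the column sums are 3, 6 and 9, and $N = 18$, so the first cell is $6 \times 3 / 18 = 1$, as required.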
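Thus, for a fixed $j$, taking the ratio of Eq. (14) for two rows $i_a$ and $i_b$ cancels the common column sum and $N$:

$$\frac{x_{i_a j}}{x_{i_b j}} = \frac{\sum_{k=1}^{n} x_{i_a k}}{\sum_{k=1}^{n} x_{i_b k}}.$$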
In the same way, for a fixed $i$,

$$\frac{x_{i j_a}}{x_{i j_b}} = \frac{\sum_{l=1}^{m} x_{l j_a}}{\sum_{l=1}^{m} x_{l j_b}}.$$
Since this relation will hold for any $j$, the following equation is obtained:

$$\frac{x_{i_a 1}}{x_{i_b 1}} = \frac{x_{i_a 2}}{x_{i_b 2}} = \cdots = \frac{x_{i_a n}}{x_{i_b n}} = \frac{\sum_{k=1}^{n} x_{i_a k}}{\sum_{k=1}^{n} x_{i_b k}}. \tag{15}$$
Since the right-hand side of the above equation is a constant that does not depend on $j$, all the ratios are equal to that constant. Thus,
Theorem 4. If two attributes in a contingency table shown in Table 2 are statistically independent, the following equations hold:

$$\frac{x_{i_a 1}}{x_{i_b 1}} = \frac{x_{i_a 2}}{x_{i_b 2}} = \cdots = \frac{x_{i_a n}}{x_{i_b n}} = \text{const.} \tag{16}$$

for all rows $i_a$ and $i_b$ ($i_a, i_b = 1, 2, \cdots, m$).
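The condition in Theorem 4 can be checked directly on a small table. The sketch below is only an illustration: the helper names `row_ratios` and `rows_have_constant_ratio` and the sample tables are hypothetical, rows are indexed from 0 rather than 1, and all cells are assumed to be nonzero so that every ratio is defined.

```python
from fractions import Fraction

def row_ratios(table, ia, ib):
    """Column-wise ratios x_{ia, j} / x_{ib, j} for every column j."""
    return [Fraction(a, b) for a, b in zip(table[ia], table[ib])]

def rows_have_constant_ratio(table):
    """True if, for every pair of rows, the column-wise ratios are all equal."""
    m = len(table)
    return all(
        len(set(row_ratios(table, ia, ib))) == 1
        for ia in range(m)
        for ib in range(m)
    )

# Hypothetical 3 x 3 table: every row is a multiple of the first row.
proportional = [[1, 2, 3],
                [2, 4, 6],
                [3, 6, 9]]
# Hypothetical table in which the last cell breaks the proportionality.
perturbed = [[1, 2, 3],
             [2, 4, 7]]

print(row_ratios(proportional, 0, 1))
print(rows_have_constant_ratio(proportional))
print(rows_have_constant_ratio(perturbed))
```

For the first table the ratios between rows 0 and 1 are all 1/2, which is also the ratio of the two row sums (6 and 12), in agreement with Eq. (15).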
6 Contingency Matrix
The meaning of the above discussions will become much clearer when we view
a contingency table as a matrix.