Table 1. Two-way contingency table

             $R_1 = 0$       $R_1 = 1$
$R_2 = 0$    $x_{11}$        $x_{12}$        $x_{1\cdot}$
$R_2 = 1$    $x_{21}$        $x_{22}$        $x_{2\cdot}$
             $x_{\cdot 1}$   $x_{\cdot 2}$   $x_{\cdot\cdot}$ $(= |U| = N)$
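Table 2. Contingency table ($m \times n$)

          $A_1$           $A_2$           $\cdots$   $A_n$           Sum
$B_1$     $x_{11}$        $x_{12}$        $\cdots$   $x_{1n}$        $x_{1\cdot}$
$B_2$     $x_{21}$        $x_{22}$        $\cdots$   $x_{2n}$        $x_{2\cdot}$
$\vdots$  $\vdots$        $\vdots$        $\ddots$   $\vdots$        $\vdots$
$B_m$     $x_{m1}$        $x_{m2}$        $\cdots$   $x_{mn}$        $x_{m\cdot}$
Sum       $x_{\cdot 1}$   $x_{\cdot 2}$   $\cdots$   $x_{\cdot n}$   $x_{\cdot\cdot} = |U| = N$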
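The marginal sums in Table 2 can be made concrete with a short sketch. The function name `marginals` and the sample counts are assumptions chosen for illustration; they do not come from the source. The sketch only shows how the row sums $x_{i\cdot}$, the column sums $x_{\cdot j}$, and the grand total $x_{\cdot\cdot} = N$ follow from the cell counts.

```python
def marginals(table):
    """Return (row_sums, col_sums, total) for a contingency table.

    `table` is a list of m rows, each a list of n non-negative counts,
    laid out as in Table 2: table[i][j] plays the role of x_{(i+1)(j+1)}.
    """
    if not table or any(len(row) != len(table[0]) for row in table):
        raise ValueError("table must be a non-empty rectangular list of rows")
    row_sums = [sum(row) for row in table]        # x_{i.} for each row B_i
    col_sums = [sum(col) for col in zip(*table)]  # x_{.j} for each column A_j
    total = sum(row_sums)                         # x_{..} = |U| = N
    return row_sums, col_sums, total


# Hypothetical 2 x 3 table of counts, for illustration only.
example = [[10, 20, 30],
           [5, 15, 20]]
row_sums, col_sums, total = marginals(example)
```

Summing the row sums and summing the column sums both give the same total, which is the identity $x_{\cdot\cdot} = N$ in the bottom-right cell of the table.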
3 Statistical Independence in 2 × 2 Contingency Table
Let us consider the contingency table shown in Table 1. Statistical independence between $R_1$ and $R_2$ gives:
$$P([R_1 = 0], [R_2 = 0]) = P([R_1 = 0]) \times P([R_2 = 0])$$
$$P([R_1 = 0], [R_2 = 1]) = P([R_1 = 0]) \times P([R_2 = 1])$$
$$P([R_1 = 1], [R_2 = 0]) = P([R_1 = 1]) \times P([R_2 = 0])$$
$$P([R_1 = 1], [R_2 = 1]) = P([R_1 = 1]) \times P([R_2 = 1])$$

Since each probability is given as a ratio of each cell to $N$, the above equations are calculated as:
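$$\frac{x_{11}}{N} = \frac{x_{11} + x_{12}}{N} \times \frac{x_{11} + x_{21}}{N}$$
$$\frac{x_{12}}{N} = \frac{x_{11} + x_{12}}{N} \times \frac{x_{12} + x_{22}}{N}$$
$$\frac{x_{21}}{N} = \frac{x_{21} + x_{22}}{N} \times \frac{x_{11} + x_{21}}{N}$$
$$\frac{x_{22}}{N} = \frac{x_{21} + x_{22}}{N} \times \frac{x_{12} + x_{22}}{N}$$

Since $N = \sum_{i,j} x_{ij}$, the following formula will be obtained from these four formulae.

$$x_{11} x_{22} = x_{12} x_{21} \quad \text{or} \quad x_{11} x_{22} - x_{12} x_{21} = 0$$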
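The reduction to the product form can be spelled out for the equation in $x_{11}$; the other three cells reduce in the same way. This is a sketch of the intermediate algebra under the definitions of Table 1, not a step quoted from the source.

```latex
\begin{align*}
\frac{x_{11}}{N} &= \frac{x_{11} + x_{12}}{N} \times \frac{x_{11} + x_{21}}{N} \\
N x_{11} &= (x_{11} + x_{12})(x_{11} + x_{21}) \\
(x_{11} + x_{12} + x_{21} + x_{22})\, x_{11}
  &= x_{11}^2 + x_{11} x_{21} + x_{11} x_{12} + x_{12} x_{21} \\
x_{11} x_{22} &= x_{12} x_{21}
\end{align*}
```

The third line substitutes $N = x_{11} + x_{12} + x_{21} + x_{22}$; cancelling the terms common to both sides leaves the last line.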
Thus,

Theorem 1. If two attributes in a contingency table shown in Table 1 are statistically independent, the following equation holds:

$$x_{11} x_{22} - x_{12} x_{21} = 0 \tag{1}$$
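Theorem 1 can be checked numerically on small tables. The sketch below is illustrative only: the function names `det2x2` and `satisfies_independence` and both sample tables are assumptions, not part of the source. It tests the four independence equations in integer form, $N x_{ij} = x_{i\cdot}\, x_{\cdot j}$, to avoid floating-point rounding, and compares the result with the left-hand side of equation (1).

```python
def det2x2(table):
    """Left-hand side of equation (1): x11*x22 - x12*x21."""
    (x11, x12), (x21, x22) = table
    return x11 * x22 - x12 * x21


def satisfies_independence(table):
    """Check the four independence equations for a 2 x 2 table of counts.

    Each equation x_ij / N = (x_i. / N) * (x_.j / N) is tested in the
    equivalent integer form N * x_ij == x_i. * x_.j.
    """
    (x11, x12), (x21, x22) = table
    n = x11 + x12 + x21 + x22
    row_sums = (x11 + x12, x21 + x22)  # x_{1.}, x_{2.}
    col_sums = (x11 + x21, x12 + x22)  # x_{.1}, x_{.2}
    return all(
        n * table[i][j] == row_sums[i] * col_sums[j]
        for i in range(2)
        for j in range(2)
    )


# Hypothetical counts: the rows of the first table are proportional.
independent = [[2, 4],
               [3, 6]]
dependent = [[3, 1],
             [1, 3]]

independent_ok = satisfies_independence(independent)
independent_det = det2x2(independent)
dependent_ok = satisfies_independence(dependent)
dependent_det = det2x2(dependent)
```

For the first table all four equations hold and the expression in equation (1) evaluates to zero, as the theorem requires. For the second table the equations fail, and the expression is nonzero.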