Table 1. Two-way contingency table

          R_1=0    R_1=1    Sum
R_2=0     x_11     x_12     x_1·
R_2=1     x_21     x_22     x_2·
Sum       x_·1     x_·2     x_·· (= |U| = N)
Table 2. Contingency table (m × n)

        A_1     A_2     ···    A_n     Sum
B_1     x_11    x_12    ···    x_1n    x_1·
B_2     x_21    x_22    ···    x_2n    x_2·
 ⋮       ⋮       ⋮      ⋱      ⋮       ⋮
B_m     x_m1    x_m2    ···    x_mn    x_m·
Sum     x_·1    x_·2    ···    x_·n    x_·· = |U| = N
3 Statistical Independence in 2 × 2 Contingency Table
Consider the contingency table shown in Table 1. Statistical independence
between $R_1$ and $R_2$ gives:
$$
\begin{aligned}
P([R_1=0],[R_2=0]) &= P([R_1=0]) \times P([R_2=0]) \\
P([R_1=0],[R_2=1]) &= P([R_1=0]) \times P([R_2=1]) \\
P([R_1=1],[R_2=0]) &= P([R_1=1]) \times P([R_2=0]) \\
P([R_1=1],[R_2=1]) &= P([R_1=1]) \times P([R_2=1])
\end{aligned}
$$

Since each probability is given as a ratio of each cell to N, the above equations
are calculated as:

$$
\begin{aligned}
\frac{x_{11}}{N} &= \frac{x_{11}+x_{12}}{N} \times \frac{x_{11}+x_{21}}{N} \\
\frac{x_{12}}{N} &= \frac{x_{11}+x_{12}}{N} \times \frac{x_{12}+x_{22}}{N} \\
\frac{x_{21}}{N} &= \frac{x_{21}+x_{22}}{N} \times \frac{x_{11}+x_{21}}{N} \\
\frac{x_{22}}{N} &= \frac{x_{21}+x_{22}}{N} \times \frac{x_{12}+x_{22}}{N}
\end{aligned}
$$

Since $N = \sum_{i,j} x_{ij}$, the following formula will be obtained from these four
formulae:

$$
x_{11} x_{22} = x_{12} x_{21} \quad \text{or} \quad x_{11} x_{22} - x_{12} x_{21} = 0
$$
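The reduction from the ratio equations to the product form is stated without intermediate steps. One way to fill them in, starting from the first ratio equation and substituting $N = x_{11} + x_{12} + x_{21} + x_{22}$, is sketched here:

$$
\begin{aligned}
N x_{11} &= (x_{11}+x_{12})(x_{11}+x_{21}) \\
x_{11}^2 + x_{11}x_{12} + x_{11}x_{21} + x_{11}x_{22} &= x_{11}^2 + x_{11}x_{21} + x_{11}x_{12} + x_{12}x_{21} \\
x_{11}x_{22} &= x_{12}x_{21}
\end{aligned}
$$

The first line multiplies both sides by $N^2$, the second expands both sides, and the third cancels the common terms. Applying the same steps to the other three ratio equations gives the same identity.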
Thus,
Theorem 1. If the two attributes in the contingency table shown in Table 1 are
statistically independent, the following equation holds:

$$
x_{11} x_{22} - x_{12} x_{21} = 0 \tag{1}
$$
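As a numerical illustration of Theorem 1, the following sketch compares the quantity in equation (1) with a direct check of the four product conditions. The function names `determinant_2x2` and `is_independent` and the sample counts are assumptions made for this example, not part of the original text. Exact rational arithmetic is used so that no floating-point tolerance is needed.

```python
from fractions import Fraction


def determinant_2x2(x11, x12, x21, x22):
    """Left-hand side of equation (1): x11*x22 - x12*x21."""
    return x11 * x22 - x12 * x21


def is_independent(x11, x12, x21, x22):
    """Check cell/N == (row sum/N) * (column sum/N) for all four cells.

    Counts are assumed to be non-negative integers laid out as in Table 1.
    """
    n = x11 + x12 + x21 + x22
    if n == 0:
        raise ValueError("the table must contain at least one observation")
    cells = ((x11, x12), (x21, x22))
    row_sums = (x11 + x12, x21 + x22)  # x_1. and x_2.
    col_sums = (x11 + x21, x12 + x22)  # x_.1 and x_.2
    return all(
        Fraction(cells[i][j], n) == Fraction(row_sums[i], n) * Fraction(col_sums[j], n)
        for i in range(2)
        for j in range(2)
    )


# Made-up counts: the second row is a multiple of the first,
# so the four product conditions hold and equation (1) gives 0.
independent_table = (2, 4, 3, 6)
# Made-up counts for which the product conditions fail.
dependent_table = (1, 2, 3, 4)

det_independent = determinant_2x2(*independent_table)
det_dependent = determinant_2x2(*dependent_table)
```

For the first sample table the direct check succeeds and the quantity in equation (1) is zero; for the second the check fails and the quantity is nonzero, which is consistent with the theorem.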
 