Thus, the following theorem is obtained.
Theorem 10. The fourth row represented by a linear combination of the first to third rows (basis) will satisfy the condition of statistical independence if and only if $\Delta(4,j) = 0$.
Unfortunately, this condition is not simpler than Theorem 9. Note that $\Delta(4,j) = 0$ is a Diophantine equation whose trivial solution is $p = q = r$. That is, the solution space includes $p = q = r$ but also contains other solutions. Thus,
Corollary 1.
If $p = q = r$, then the fourth row satisfies the condition of statistical independence.
The converse is not true.
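
The corollary can be read off from the structure of $\Delta(4,j)$. The sketch below assumes that $\Delta(4,j)$ has the following form, which agrees with the numbers in Example 2; the row-sum symbol $S_i = \sum_{k=1}^{4} x_{ik}$ is shorthand introduced here, not notation from the text.

```latex
\[
\Delta(4,j) = (p-q)\,(x_{1j}S_2 - x_{2j}S_1)
            + (r-p)\,(x_{3j}S_1 - x_{1j}S_3)
            + (q-r)\,(x_{2j}S_3 - x_{3j}S_2),
\qquad S_i = \sum_{k=1}^{4} x_{ik}.
\]
```

Under this assumption, $p = q = r$ makes every coefficient difference vanish, so $\Delta(4,j) = 0$ whatever the entries are. Conversely, for a fixed $j$ the equation $\Delta(4,j) = 0$ is a single linear constraint on $(p, q, r)$, so it generally admits solutions other than $p = q = r$.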
Example 2.
Let us consider the following matrix:
\[
E = \begin{pmatrix}
1 & 1 & 2 & 2 \\
2 & 2 & 3 & 3 \\
4 & 4 & 5 & 5 \\
x_{41} & x_{42} & x_{43} & x_{44}
\end{pmatrix}.
\]
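The question is when the fourth row represented by the other rows satisfies the condition of statistical independence. Since, for $j = 1, 2$,
\[
x_{1j}\sum_{k=1}^{4} x_{2k} - x_{2j}\sum_{k=1}^{4} x_{1k} = -2, \quad
x_{3j}\sum_{k=1}^{4} x_{1k} - x_{1j}\sum_{k=1}^{4} x_{3k} = 6, \quad \text{and} \quad
x_{2j}\sum_{k=1}^{4} x_{3k} - x_{3j}\sum_{k=1}^{4} x_{2k} = -4,
\]
$\Delta(4,j)$ is equal to
\[
-2(p-q) + 6(r-p) - 4(q-r) = -8p - 2q + 10r.
\]
For $j = 3, 4$ all three values change sign, which leaves the condition $\Delta(4,j) = 0$ unchanged.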
Thus, the set of solutions is $\{(p,q,r) \mid 10r = 8p + 2q\}$, where $p = q = r$ is included.
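
The computation in Example 2 can be checked numerically. The following is a minimal sketch, assuming $\Delta(4,j)$ is the sum of coefficient differences times row-sum cross terms that reproduces the values in the example; the names `ROWS`, `delta4`, and `is_solution` are hypothetical, and column indices are 0-based.

```python
# First three rows of the matrix E in Example 2.
ROWS = [
    [1, 1, 2, 2],
    [2, 2, 3, 3],
    [4, 4, 5, 5],
]


def delta4(j, p, q, r, rows=ROWS):
    """Delta(4, j) for 0-based column j, under the assumed form
    (p-q)(x1j*S2 - x2j*S1) + (r-p)(x3j*S1 - x1j*S3) + (q-r)(x2j*S3 - x3j*S2),
    where S_i is the sum of row i."""
    x1, x2, x3 = rows
    s1, s2, s3 = sum(x1), sum(x2), sum(x3)
    return ((p - q) * (x1[j] * s2 - x2[j] * s1)
            + (r - p) * (x3[j] * s1 - x1[j] * s3)
            + (q - r) * (x2[j] * s3 - x3[j] * s2))


def is_solution(p, q, r):
    """True when Delta(4, j) vanishes for every column j."""
    return all(delta4(j, p, q, r) == 0 for j in range(4))


if __name__ == "__main__":
    for triple in [(3, 3, 3), (1, 6, 2), (1, 1, 2)]:
        p, q, r = triple
        print(triple, is_solution(p, q, r), 10 * r == 8 * p + 2 * q)
```

For each triple, the two printed booleans agree, which is what the solution set $10r = 8p + 2q$ predicts for this matrix.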
Notably, the solutions are characterized by the Diophantine equation $10r = 8p + 2q$, and a contingency table given by one triple $(p,q,r)$ may also be given by another triple. For example, $(3,3,3)$ gives the same contingency table as $(1,6,2)$:
\[
\begin{pmatrix}
1 & 1 & 2 & 2 \\
2 & 2 & 3 & 3 \\
4 & 4 & 5 & 5 \\
21 & 21 & 30 & 30
\end{pmatrix}.
\]
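
The claim that two triples yield the same table can be verified directly. The sketch below builds the fourth row from a triple and tests the independence condition on that row, taken here to be $x_{4j} N = x_{4\cdot}\, x_{\cdot j}$ for every column $j$, with $N$ the grand total (an assumption about the definition used in the text). The helper names are hypothetical.

```python
# Basis rows (first to third rows) of the matrix in Example 2.
BASIS = [
    [1, 1, 2, 2],
    [2, 2, 3, 3],
    [4, 4, 5, 5],
]


def fourth_row(p, q, r, basis=BASIS):
    """Return p*row1 + q*row2 + r*row3, column by column."""
    return [p * a + q * b + r * c for a, b, c in zip(*basis)]


def row_is_independent(table, i):
    """Check x_ij * N == (row i total) * (column j total) for every j."""
    n = sum(sum(row) for row in table)
    row_total = sum(table[i])
    col_totals = [sum(col) for col in zip(*table)]
    return all(table[i][j] * n == row_total * col_totals[j]
               for j in range(len(col_totals)))


if __name__ == "__main__":
    for triple in [(3, 3, 3), (1, 6, 2)]:
        row = fourth_row(*triple)
        print(triple, row, row_is_independent(BASIS + [row], 3))
```

Both triples produce the row (21, 21, 30, 30). They differ by (−2, 3, −1), and for this particular matrix −2·(row 1) + 3·(row 2) − (row 3) is the zero vector, which is why the fourth row is unchanged.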
It will be our future work to investigate the general characteristics of the
solution space.
7.3 Contingency Table (4 × 4, Rank: 2)
When the rank is equal to 2, it can be assumed that the third and fourth rows are represented by the first and second rows:
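\[
(x_{41} \;\, x_{42} \;\, x_{43} \;\, x_{44}) = p\,(x_{11} \;\, x_{12} \;\, x_{13} \;\, x_{14}) + q\,(x_{21} \;\, x_{22} \;\, x_{23} \;\, x_{24}) \tag{20}
\]
\[
(x_{31} \;\, x_{32} \;\, x_{33} \;\, x_{34}) = r\,(x_{11} \;\, x_{12} \;\, x_{13} \;\, x_{14}) + s\,(x_{21} \;\, x_{22} \;\, x_{23} \;\, x_{24}) \tag{21}
\]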
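
Equations (20) and (21) can be illustrated with a small construction. This is a sketch under stated assumptions: the two basis rows and the coefficient values are illustrative only, `rank2_table` and `rank` are hypothetical helpers, and the rank is computed by exact Gaussian elimination over rationals rather than by any method from the text.

```python
from fractions import Fraction


def rank2_table(row1, row2, p, q, r, s):
    """Build a 4x4 table whose third and fourth rows follow Eqs. (21) and (20)."""
    row3 = [r * a + s * b for a, b in zip(row1, row2)]  # Eq. (21)
    row4 = [p * a + q * b for a, b in zip(row1, row2)]  # Eq. (20)
    return [list(row1), list(row2), row3, row4]


def rank(matrix):
    """Rank of a matrix via Gaussian elimination with exact fractions."""
    m = [[Fraction(v) for v in row] for row in matrix]
    rk = 0
    for col in range(len(m[0])):
        # Find a row at or below position rk with a nonzero entry in this column.
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            factor = m[i][col] / m[rk][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
        if rk == len(m):
            break
    return rk


if __name__ == "__main__":
    # Illustrative values only: p=2, q=1 for row 4; r=1, s=1 for row 3.
    table = rank2_table([1, 1, 2, 2], [2, 2, 3, 3], p=2, q=1, r=1, s=1)
    for row in table:
        print(row)
    print("rank:", rank(table))
```

Any choice of $p$, $q$, $r$, $s$ gives a table of rank at most 2, since every row lies in the span of the first two; the rank is exactly 2 whenever those two rows are linearly independent.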