where the last two rows are represented by the first three columns. That is, the
rank of the matrix is equal to 3. Then, according to Theorem 13, the following
equations are obtained:
\[ (5k_{53} - k_{52} - 4k_{51}) \times \{ k_{41} - 2k_{43} + (k_{51} - 2k_{53} - 1) \} = 0 \tag{25} \]
\[ (5k_{43} - k_{42} - 4k_{41}) \times \{ k_{41} - 2k_{43} + (k_{51} - 2k_{53} - 1) \} = 0 \tag{26} \]
In the case of $k_{41} - 2k_{43} + (k_{51} - 2k_{53} - 1) = 0$, simple calculations give several
equations for those coefficients.
\[ k_{41} + k_{51} = 2(k_{43} + k_{53}) + 1 \]
\[ k_{42} + k_{52} = -3(k_{43} + k_{53}) \]
The solutions of these two equations give examples of pseudo-statistical inde-
pendence.
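As an illustration only, the following Python sketch enumerates solutions of the two equations above by brute force. It assumes that the coefficients are integers (in line with treating the constraints as Diophantine equations) and that a small search bound is enough to show examples. The function names and the bound are hypothetical and do not come from the text.

```python
from itertools import product


def satisfies_constraints(k41, k42, k43, k51, k52, k53):
    """Check the two coefficient equations stated above.

    k41 + k51 = 2(k43 + k53) + 1
    k42 + k52 = -3(k43 + k53)
    """
    s = k43 + k53
    return (k41 + k51 == 2 * s + 1) and (k42 + k52 == -3 * s)


def common_factor(k41, k43, k51, k53):
    """The factor shared by equations (25) and (26)."""
    return k41 - 2 * k43 + (k51 - 2 * k53 - 1)


def enumerate_solutions(bound):
    """List integer tuples (k41, k42, k43, k51, k52, k53) in [-bound, bound].

    Assumption: integer coefficients and a finite search range, chosen
    purely for illustration.
    """
    values = range(-bound, bound + 1)
    return [c for c in product(values, repeat=6) if satisfies_constraints(*c)]


if __name__ == "__main__":
    for k41, k42, k43, k51, k52, k53 in enumerate_solutions(1)[:5]:
        # Every solution makes the shared factor vanish, so the left-hand
        # sides of (25) and (26) are both zero.
        lhs25 = (5 * k53 - k52 - 4 * k51) * common_factor(k41, k43, k51, k53)
        lhs26 = (5 * k43 - k42 - 4 * k41) * common_factor(k41, k43, k51, k53)
        print((k41, k42, k43, k51, k52, k53), lhs25, lhs26)
```

The first equation is just the shared factor of (25) and (26) set to zero and rearranged, so every tuple found this way satisfies both (25) and (26).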
9 Conclusion
In this chapter, a contingency table is interpreted from the viewpoint of
granular computing and statistical independence. By definition, statistical
independence holds in a contingency table when the collinearity equations (14)
are satisfied. In other words, statistical independence can be viewed as
linear dependence. The correspondence between a contingency table and a
matrix then yields the theorem that the rank of the contingency matrix of a
given contingency table is equal to 1 if the two attributes are statistically
independent. That is, all the rows of the contingency table can be described
by one row, with coefficients given by a marginal distribution. If the rank is
maximal, then the two attributes are dependent. Otherwise, some probabilistic
structure can be found within the attribute-value pairs of a given attribute,
which we call contextual independence. Moreover, from the characteristics of
statistical independence, a contingency table may be composed of statistically
independent and dependent parts, which we call pseudo-statistical dependence.
In such cases, merging several rows or columns yields a new contingency table
with statistical independence, whose corresponding matrix has rank 1. In
particular, we obtain Diophantine equations for pseudo-statistical dependence.
Thus, matrix algebra and elementary number theory are the key methods for
analyzing a contingency table and its degree of independence, where the rank
and the structure of linear dependence, expressed as Diophantine equations,
play very important roles in determining the nature of a given table.
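The rank statements summarized above can be checked numerically. The following Python sketch is a minimal illustration, not the chapter's own procedure. It computes the rank of a small contingency matrix by exact Gaussian elimination and shows how merging two rows can reduce the rank to 1. The tables, `matrix_rank`, and `merge_rows` are hypothetical examples introduced here.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix via Gaussian elimination with exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current rank with a non-zero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate the entries below the pivot.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def merge_rows(rows, i, j):
    """Return a new table in which rows i and j are replaced by their sum."""
    merged = [a + b for a, b in zip(rows[i], rows[j])]
    kept = [row for idx, row in enumerate(rows) if idx not in (i, j)]
    return kept + [merged]


if __name__ == "__main__":
    # Every row is a multiple of the first: rank 1 (statistical independence).
    independent = [[1, 2, 3], [2, 4, 6], [3, 6, 9]]
    print(matrix_rank(independent))

    # Full rank (2) as given, but the sum of rows 1 and 2 is [4, 6] = 2 * [2, 3],
    # so the merged table has rank 1.
    table = [[2, 3], [3, 1], [1, 5]]
    print(matrix_rank(table), matrix_rank(merge_rows(table, 1, 2)))
```

Exact fractions are used instead of floating point so that the rank test is not affected by rounding, which matters when the entries are integer counts.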