as a matrix, the above discussion shows that the rank of the matrix is equal
to 1.0. The results also show that partial statistical independence can be
observed.
The second important observation is that matrix algebra is central to the
analysis of this table. A contingency table can be viewed as a matrix, and
several operations and ideas from matrix theory are introduced into the
analysis of the contingency table.
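As a small illustration of the matrix view, the following Python sketch treats a contingency table as a matrix and computes its rank with exact arithmetic. The counts and the helper names `is_independent` and `matrix_rank` are assumptions made for this example only; they do not come from the chapter. The sketch relies on the standard fact that a table whose cells equal the product of its marginals divided by the total is an outer product, and therefore has rank 1.

```python
from fractions import Fraction


def is_independent(table):
    """True if every cell equals (row sum * column sum) / total."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    return all(
        table[i][j] * total == row_sums[i] * col_sums[j]
        for i in range(len(table))
        for j in range(len(table[0]))
    )


def matrix_rank(table):
    """Rank via Gaussian elimination over exact fractions."""
    m = [[Fraction(x) for x in row] for row in table]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        # Look for a nonzero pivot in this column, at or below row `rank`.
        pivot = next((r for r in range(rank, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate the entries below the pivot.
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == n_rows:
            break
    return rank


# Hypothetical 2 x 2 tables of counts (illustrative values only).
independent = [[10, 20], [30, 60]]  # rows are proportional
dependent = [[10, 20], [30, 10]]

print(is_independent(independent), matrix_rank(independent))  # True 1
print(is_independent(dependent), matrix_rank(dependent))      # False 2
```

Exact fractions are used instead of floating point so that the rank test is not affected by rounding.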
The chapter is organized as follows: Section 2 discusses the characteristics
of contingency tables. Section 3 shows the conditions on statistical
independence for a 2 × 2 table. Section 4 gives those for a 2 × n table.
Section 5 extends these results into a multiway contingency table. Section 6
discusses statistical independence from matrix theory. Sections 7 and 8 show
pseudo-statistical independence. Finally, Sect. 9 concludes this chapter.
2 Contingency Table from Rough Sets
2.1 Rough Sets Notations
In the subsequent sections, the following notation, introduced in [2], is
adopted. Let $U$ denote a nonempty, finite set called the universe and $A$
denote a nonempty, finite set of attributes, i.e., $a : U \to V_a$ for
$a \in A$, where $V_a$ is called the domain of $a$. Then, a decision table is
defined as an information system, $\mathcal{A} = (U, A \cup \{D\})$, where
$\{D\}$ is a set of given decision attributes. The atomic formulas over
$B \subseteq A \cup \{D\}$ and $V$ are expressions of the form $[a = v]$,
called descriptors over $B$, where $a \in B$ and $v \in V_a$. The set
$F(B, V)$ of formulas over $B$ is the least set containing all atomic formulas
over $B$ and closed with respect to disjunction, conjunction and negation. For
each $f \in F(B, V)$, $f_{\mathcal{A}}$ denotes the meaning of $f$ in
$\mathcal{A}$, i.e., the set of all objects in $U$ with property $f$, defined
inductively as follows:
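1. If $f$ is of the form $[a = v]$, then $f_{\mathcal{A}} = \{ s \in U \mid a(s) = v \}$.
2. $(f \wedge g)_{\mathcal{A}} = f_{\mathcal{A}} \cap g_{\mathcal{A}}$; $(f \vee g)_{\mathcal{A}} = f_{\mathcal{A}} \cup g_{\mathcal{A}}$; $(\neg f)_{\mathcal{A}} = U - f_{\mathcal{A}}$.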
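The inductive definition above can be sketched in Python as a recursive evaluator. Everything in this sketch is an assumption made for illustration: the toy table, the attribute names `a`, `b`, `d`, the tuple encoding of formulas, and the function name `meaning` are not part of the chapter.

```python
# Toy decision table: each object s in U maps attribute names to values.
# Objects, attributes and values are invented for this example.
TABLE = {
    0: {"a": 1, "b": 0, "d": "yes"},
    1: {"a": 1, "b": 1, "d": "no"},
    2: {"a": 0, "b": 1, "d": "yes"},
    3: {"a": 0, "b": 0, "d": "no"},
}


def meaning(f, table):
    """Return the set of objects in U that satisfy formula f.

    Formulas are nested tuples (a hypothetical encoding):
      ("desc", a, v)  stands for the descriptor [a = v]
      ("and", f, g), ("or", f, g), ("not", f)
    """
    op = f[0]
    if op == "desc":
        _, attribute, value = f
        return {s for s, row in table.items() if row[attribute] == value}
    if op == "and":
        return meaning(f[1], table) & meaning(f[2], table)
    if op == "or":
        return meaning(f[1], table) | meaning(f[2], table)
    if op == "not":
        return set(table) - meaning(f[1], table)
    raise ValueError(f"unknown connective: {op!r}")


a_is_1 = ("desc", "a", 1)
b_is_1 = ("desc", "b", 1)

print(meaning(a_is_1, TABLE))                   # {0, 1}
print(meaning(("and", a_is_1, b_is_1), TABLE))  # {1}
print(meaning(("or", a_is_1, b_is_1), TABLE))   # {0, 1, 2}
print(meaning(("not", a_is_1), TABLE))          # {2, 3}
```

Conjunction, disjunction and negation map directly onto set intersection, union and complement with respect to the universe, which is why the evaluator needs only Python's built-in set operators.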
Using this framework, classification accuracy and coverage (true positive
rate) are defined as follows.
Definition 1. Let $R$ and $D$ denote a formula in $F(B, V)$ and a set of
objects whose decision attribute is given as $D$, respectively. Classification
accuracy and coverage (true positive rate) for $R \to D$ are defined as:

$$\alpha_R(D) = \frac{|R_{\mathcal{A}} \cap D|}{|R_{\mathcal{A}}|} \; (= P(D \mid R)), \quad \text{and}$$

$$\kappa_R(D) = \frac{|R_{\mathcal{A}} \cap D|}{|D|} \; (= P(R \mid D)).$$
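As a worked instance of the two ratios in Definition 1, the following Python sketch computes accuracy and coverage from two sets of objects. The function name `accuracy_and_coverage` and the example sets are hypothetical, and the sketch assumes both sets are nonempty so that the denominators are defined.

```python
from fractions import Fraction


def accuracy_and_coverage(r_meaning, d_objects):
    """Return (alpha, kappa) for a rule R -> D.

    r_meaning: set of objects satisfying R (the meaning of R)
    d_objects: set of objects whose decision attribute is D
    """
    if not r_meaning or not d_objects:
        raise ValueError("both sets must be nonempty")
    both = len(r_meaning & d_objects)
    alpha = Fraction(both, len(r_meaning))  # |R ∩ D| / |R| = P(D | R)
    kappa = Fraction(both, len(d_objects))  # |R ∩ D| / |D| = P(R | D)
    return alpha, kappa


# Illustrative sets of object identifiers (invented for this example).
R_A = {0, 1, 2}
D = {1, 2, 3, 4}

alpha, kappa = accuracy_and_coverage(R_A, D)
print(alpha, kappa)  # 2/3 1/2
```

Here two of the three objects satisfying R belong to D, so accuracy is 2/3, while those two objects make up half of D, so coverage is 1/2.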
 