contradictory view from the intuition that when two attributes have many
values, the dependence between these two attributes becomes low.
The key to understanding these conflicts is to consider the constraint on
the sample size.
In [3] we show that the constraint on the sample size of a contingency table
is very strong, which leads to an evaluation formula in which an increase in
the degree of granularity gives a decrease in dependency.
This paper confirms this constraint by using enumerative combinatorics.
The results show that the sample size restricts the nature of the matrix in a
combinatorial way, which suggests that the dependency is closely related to
integer programming.
The paper is organized as follows: Section 2 shows preliminaries. Sections
3 and 4 discuss the earlier results. Section 5 shows the effect of sample size
on a 2 × 2 matrix theoretically. Section 6 introduces empirical validation of
the results obtained in Sect. 5. Finally, Sect. 7 concludes this paper.
2 Preliminary Work
2.1 Notations from Rough Sets
In the subsequent sections, the following notation is adopted, which was
introduced in [4]. Let U denote a nonempty, finite set called the universe and A
denote a nonempty, finite set of attributes, i.e., a : U → V_a for a ∈ A, where
V_a is called the domain of a. Then, a decision table is defined as
an information system, A = (U, A ∪ {D}), where {D} is a set of given decision
attributes. The atomic formulas over B ⊆ A ∪ {D} and V are expressions of
the form [a = v], called descriptors over B, where a ∈ B and v ∈ V_a. The
set F(B, V) of formulas over B is the least set containing all atomic formulas
over B and closed with respect to disjunction, conjunction and negation. For
each f ∈ F(B, V), f_A denotes the meaning of f in A, i.e., the set of all objects
in U with property f, defined inductively as follows.
1. If f is of the form [a = v], then f_A = {s ∈ U | a(s) = v}
2. (f ∧ g)_A = f_A ∩ g_A; (f ∨ g)_A = f_A ∪ g_A; (¬f)_A = U − f_A
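The inductive definition of f_A can be illustrated with a minimal Python sketch. The toy universe U, the attribute names "a" and "b", and the helper names descriptor, conj, disj and neg are all hypothetical and chosen only for illustration; they do not come from [4].

```python
# Hypothetical toy information system: each object s in U maps
# attribute names to values (names and data are illustrative only).
U = {
    "s1": {"a": 0, "b": 1},
    "s2": {"a": 1, "b": 1},
    "s3": {"a": 0, "b": 0},
}

def descriptor(attr, value):
    """Meaning of the atomic formula [attr = value]: {s in U | attr(s) = value}."""
    return frozenset(s for s, row in U.items() if row[attr] == value)

def conj(f_A, g_A):
    """Meaning of (f AND g): intersection of the two meanings."""
    return f_A & g_A

def disj(f_A, g_A):
    """Meaning of (f OR g): union of the two meanings."""
    return f_A | g_A

def neg(f_A):
    """Meaning of (NOT f): complement with respect to U."""
    return frozenset(U) - f_A

f_A = descriptor("a", 0)  # objects with a = 0
g_A = descriptor("b", 1)  # objects with b = 1
```

Representing each meaning as a set makes rules 1 and 2 direct: conjunction, disjunction and negation become set intersection, union and complement.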
Contingency Matrix
Definition 1. Let R_1 and R_2 denote multinomial attributes in an attribute
space A which have m and n values, respectively. A contingency table is a table
of a set of the meanings of the following formulas:
|[R_1 = A_j]_A|, |[R_2 = B_i]_A|, |[R_1 = A_j ∧ R_2 = B_i]_A|, |U|
(i = 1, 2, 3, ···, n and j = 1, 2, 3, ···, m).
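As a rough illustration of Definition 1, the following Python sketch counts the four kinds of quantities from a list of objects. The function name contingency_table, the attribute keys "R1" and "R2", and the sample rows are assumptions made for this example only.

```python
from collections import Counter

def contingency_table(rows, r1, r2):
    """Return the cell counts, both sets of marginal counts, and |U|.

    cells[(B_i, A_j)] corresponds to |[R1 = A_j AND R2 = B_i]_A|,
    r1_marg[A_j]     corresponds to |[R1 = A_j]_A|,
    r2_marg[B_i]     corresponds to |[R2 = B_i]_A|.
    """
    cells = Counter((row[r2], row[r1]) for row in rows)
    r1_marg = Counter(row[r1] for row in rows)
    r2_marg = Counter(row[r2] for row in rows)
    return cells, r1_marg, r2_marg, len(rows)

# Hypothetical sample of four objects (illustrative data only).
rows = [
    {"R1": "A1", "R2": "B1"},
    {"R1": "A1", "R2": "B2"},
    {"R1": "A2", "R2": "B1"},
    {"R1": "A1", "R2": "B1"},
]
cells, r1_marg, r2_marg, n = contingency_table(rows, "R1", "R2")
```

Because every object falls into exactly one cell, the cell counts always sum to |U|, which is the sample-size constraint discussed in the introduction.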