Let $\mathrm{obs}(a_{ks}, a_{cj})$ represent the actual observed frequency of $(a_{ks}, a_{cj})$ in $S$. The expression

$$
D = \sum_{j=1}^{q} \frac{\bigl(\mathrm{obs}(a_{ks}, a_{cj}) - \mathrm{exp}(a_{ks}, a_{cj})\bigr)^{2}}{\mathrm{exp}(a_{ks}, a_{cj})}, \qquad (4.51)
$$
summing over the outcomes of $C$ in the contingency table, possesses an asymptotic chi-squared property with $q-1$ degrees of freedom. $D$ can then be used in a criterion for testing the statistical dependency between $a_{ks}$ and $C$ at a presumed significance level, as described below. For this purpose, we define a mapping
$$
h_{k}(a_{ks}, C) =
\begin{cases}
1, & \text{if } D > \chi^{2}_{(q-1)}; \\
0, & \text{otherwise,}
\end{cases} \qquad (4.52)
$$
where $\chi^{2}_{(q-1)}$ is the tabulated chi-squared value. The subset of selected events of $X_k$ that has statistical interdependency with $C$ is defined as

$$
E_{k} = \bigl\{\, a_{ks} \;\big|\; h_{k}(a_{ks}, C) = 1 \,\bigr\}. \qquad (4.53)
$$
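The screening step of Eqs. (4.51)–(4.53) can be sketched as a short routine. This is a minimal illustration, not the authors' implementation: the function name is hypothetical, expected frequencies are taken as the usual row-total-times-column-total over sample size, and the caller supplies the tabulated chi-squared value $\chi^{2}_{(q-1)}$ for the chosen significance level.

```python
from collections import Counter

def covered_events(xs, cs, chi2_crit):
    """Select the events a_ks of X_k that pass the chi-squared
    criterion of Eqs. (4.51)-(4.53).  `xs` and `cs` are parallel
    lists of observed values of X_k and C; `chi2_crit` is the
    tabulated chi-squared value at q-1 degrees of freedom, where
    q is the number of outcomes of C.  Returns the set E_k."""
    n = len(xs)
    obs = Counter(zip(xs, cs))        # obs(a_ks, a_cj), joint counts
    x_marg = Counter(xs)              # marginal counts of X_k
    c_marg = Counter(cs)              # marginal counts of C
    e_k = set()
    for a_ks in x_marg:
        # D sums over the q outcomes a_cj of C (Eq. 4.51),
        # with exp(a_ks, a_cj) the expected frequency under independence.
        d = 0.0
        for a_cj in c_marg:
            exp = x_marg[a_ks] * c_marg[a_cj] / n
            d += (obs[(a_ks, a_cj)] - exp) ** 2 / exp
        if d > chi2_crit:             # h_k(a_ks, C) = 1 (Eq. 4.52)
            e_k.add(a_ks)             # include a_ks in E_k (Eq. 4.53)
    return e_k
```

For a binary $C$ (so $q-1 = 1$ degree of freedom at the 0.05 level, critical value 3.841), a perfectly associated sample selects every event, while an independent one selects none.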
We call $E_k$ the covered event subset of $X_k$ with respect to $C$. Likewise, the covered event subset $E_c$ of $C$ with respect to $X_k$ can be defined. After finding the covered event subsets $E_c$ and $E_k$ for a variable pair $(C, X_k)$, information measures can be used to detect the statistical pattern of these subsets. An interdependence redundancy measure between $X_k$ and $C_k$ can be defined as
$$
R(X_{k}, C_{k}) = \frac{I(X_{k}, C_{k})}{H(X_{k}, C_{k})}, \qquad (4.54)
$$
where $I(X_{k}, C_{k})$ is the expected mutual information and $H(X_{k}, C_{k})$ is the Shannon entropy, defined respectively on $X_k$ and $C_k$:
$$
I(X_{k}, C_{k}) = \sum_{a_{ks} \in E_{k}} \sum_{a_{cu} \in E_{c}} P(a_{cu}, a_{ks}) \log \frac{P(a_{cu}, a_{ks})}{P(a_{cu})\, P(a_{ks})} \qquad (4.55)
$$
and
$$
H(X_{k}, C_{k}) = - \sum_{a_{ks} \in E_{k}} \sum_{a_{cu} \in E_{c}} P(a_{cu}, a_{ks}) \log P(a_{cu}, a_{ks}). \qquad (4.56)
$$
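Eqs. (4.54)–(4.56) can be sketched as follows. This is an illustrative estimator, not the authors' code: probabilities are estimated by relative frequency (an assumption, since the text does not fix the estimator), zero joint cells are skipped under the convention $0 \log 0 = 0$, and the function name is hypothetical.

```python
from collections import Counter
from math import log

def redundancy(xs, cs, e_k, e_c):
    """Interdependence redundancy R(X_k, C_k) = I / H computed over
    the covered event subsets E_k and E_c (Eqs. 4.54-4.56).  `xs` and
    `cs` are parallel lists of observed values of X_k and C."""
    n = len(xs)
    joint = Counter(zip(xs, cs))      # joint counts of (a_ks, a_cu)
    px = Counter(xs)                  # marginal counts of X_k
    pc = Counter(cs)                  # marginal counts of C
    i_val = 0.0                       # expected mutual information, Eq. (4.55)
    h_val = 0.0                       # joint Shannon entropy, Eq. (4.56)
    for a_ks in e_k:
        for a_cu in e_c:
            p = joint[(a_ks, a_cu)] / n
            if p == 0.0:
                continue              # 0 log 0 is taken as 0
            i_val += p * log(p / ((px[a_ks] / n) * (pc[a_cu] / n)))
            h_val -= p * log(p)
    return i_val / h_val              # Eq. (4.54)
```

Because $R$ normalizes $I$ by the joint entropy, a perfectly dependent pair yields $R = 1$ and an independent pair yields $R = 0$.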
The interdependence redundancy measure has a chi-squared distribution:
$$
R(X_{k}, C_{k}) = \frac{I(X_{k}, C_{k})}{H(X_{k}, C_{k})} \approx \frac{\chi^{2}_{df}}{2\,|S|\,H(X_{k}, C_{k})}. \qquad (4.57)
$$
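The relation in Eq. (4.57) yields a simple significance check: the pair is judged interdependent when the observed $R$ exceeds the threshold $\chi^{2}_{df} / (2|S|\,H)$. A minimal sketch, assuming the caller supplies the tabulated chi-squared critical value (e.g. from a table or `scipy.stats.chi2.ppf`); the function name is hypothetical.

```python
def is_interdependent(r, chi2_crit, n, h):
    """Decide interdependence via Eq. (4.57): since 2|S| I(X_k, C_k)
    is asymptotically chi-squared, compare the observed redundancy
    `r` = R(X_k, C_k) against chi2_crit / (2 n h), where `n` = |S|
    is the sample size and `h` = H(X_k, C_k) the joint entropy."""
    return r > chi2_crit / (2 * n * h)
```

For example, with $|S| = 20$, $H \approx 0.6931$ (one bit in nats), and the 0.05-level critical value 3.841 at one degree of freedom, the threshold is about 0.139, so $R = 1$ passes and $R = 0.05$ does not.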