where $df$ is the corresponding degree of freedom, having the value
$(|E_c| - 1)(|E_k| - 1)$.
A chi-squared test is then used to select interdependent variables in $X$ at a presumed
significance level.
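As a rough illustration of the selection step, the sketch below computes a chi-squared statistic and its degrees of freedom for one variable paired with the cluster label. It swaps in the ordinary Pearson chi-squared statistic on a contingency table, which may differ from the exact statistic used in this text, and the function names and the `critical_value` parameter are assumptions made for the example.

```python
from collections import Counter

def chi_squared_statistic(pairs):
    """Pearson chi-squared statistic and degrees of freedom.

    pairs: list of (x_value, cluster_label) observations (hypothetical layout).
    """
    n = len(pairs)
    joint = Counter(pairs)
    x_counts = Counter(x for x, _ in pairs)
    c_counts = Counter(c for _, c in pairs)

    stat = 0.0
    for x, nx in x_counts.items():
        for c, nc in c_counts.items():
            expected = nx * nc / n            # count expected under independence
            observed = joint.get((x, c), 0)
            stat += (observed - expected) ** 2 / expected

    # Same form as the degree of freedom above: (|E_c| - 1)(|E_k| - 1),
    # taking the observed values as the event sets (an assumption).
    df = (len(x_counts) - 1) * (len(c_counts) - 1)
    return stat, df

def is_interdependent(pairs, critical_value):
    """True when the statistic exceeds the caller-supplied critical value."""
    stat, _ = chi_squared_statistic(pairs)
    return stat > critical_value
```

The critical value has to come from a chi-squared table for the computed degree of freedom and the chosen significance level; for example, 3.841 is the usual value for one degree of freedom at the 0.05 level.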
The cluster regrouping process uses an information measure to regroup data iteratively.
Wong et al. have proposed an information measure called normalized surprisal
(NS) to indicate the significance of joint information. Using this measure, the
information conditioned by an observed event $x_k$ is weighted according to
$R(X_k, C_k)$, its measure of interdependency with the cluster label variable.
Therefore, the higher the interdependency of a conditioning event, the more relevant
the event is. NS measures the joint information of a hypothesized value based on the
selected set of significant components. It is defined as
$$
\mathrm{NS}(a_{cj} \mid x(a_{cj})) = \frac{I(a_{cj} \mid x(a_{cj}))}{\sum_{k=1}^{m} R(X_k, C_k)} \tag{4.58}
$$
where $I(a_{cj} \mid x(a_{cj}))$ is the summation of the weighted conditional information
defined on the incomplete probability distribution scheme as
$$
\begin{aligned}
I(a_{cj} \mid x(a_{cj})) &= \sum_{k=1}^{m} R(X_k, C_k)\, I(a_{cj} \mid x_k) \\
&= \sum_{k=1}^{m} R(X_k, C_k) \left( -\log \frac{P(a_{cj} \mid x_k)}{\sum_{a_{cu} \in E_c} P(a_{cu} \mid x_k)} \right)
\end{aligned} \tag{4.59}
$$
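A small numerical sketch may help make the two definitions concrete. It reads (4.58) as the weighted information divided by the sum of the weights, and takes the information term in (4.59) as the negative logarithm of the renormalized conditional probability. The tuple layout, the function names, and the use of the natural logarithm are assumptions for illustration only.

```python
import math

def weighted_information(selected):
    """Weighted conditional information, in the spirit of (4.59).

    selected: list of (r_k, p_cj, p_sum) tuples, one per selected component x_k:
      r_k   - interdependency weight R(X_k, C_k)
      p_cj  - P(a_cj | x_k) for the hypothesized label
      p_sum - sum of P(a_cu | x_k) over the labels a_cu in E_c
    """
    return sum(r * -math.log(p_cj / p_sum) for r, p_cj, p_sum in selected)

def normalized_surprisal(selected):
    """Weighted information divided by the total weight, as read from (4.58)."""
    total_weight = sum(r for r, _, _ in selected)
    return weighted_information(selected) / total_weight

# Two equally weighted components; the hypothesized label is less probable
# under the first component than under the second.
example = [(0.5, 0.25, 1.0), (0.5, 0.5, 1.0)]
ns = normalized_surprisal(example)
```

Under this reading, a label that is probable given the highly weighted components receives a small NS, which is why a minimum is sought when labels are compared.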
In rendering a meaningful calculation in the incomplete probability scheme
formulation, $x_k$ is selected if
$$
\sum_{a_{cu} \in E_c} P(a_{cu} \mid x_k) > T \tag{4.60}
$$
where $T \geq 0$ is a size threshold for meaningful estimation. NS can be used in a
decision rule in the regrouping process. Let $C = \{a_{c1}, \ldots, a_{cq}\}$ be the set of
possible cluster labels. We assign $a_{cj}$ to $x_e$ if
$$
\mathrm{NS}(a_{cj} \mid x(a_{cj})) = \min_{a_{cu} \in C} \mathrm{NS}(a_{cu} \mid x(a_{cu})).
$$
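The selection test and the decision rule can be sketched together as follows. The data layout is hypothetical: each component is a pair of its weight, standing for $R(X_k, C_k)$, and a dictionary of conditional probabilities over the labels in $E_c$, all assumed to be positive. NS is taken as the weighted negative log of the renormalized probability divided by the total weight, which is one reading of (4.58) and (4.59). Ties between labels are not handled in this sketch.

```python
import math

def normalized_surprisal(label, components, threshold):
    """NS of `label` over the components that pass the size threshold.

    components: list of (weight, probs) pairs, where probs maps each label
    to P(label | x_k). Returns None when no component is selected.
    """
    info = 0.0
    total_weight = 0.0
    for weight, probs in components:
        mass = sum(probs.values())
        if mass <= threshold:
            continue                      # fails the test in (4.60)
        info += weight * -math.log(probs[label] / mass)
        total_weight += weight
    if total_weight == 0.0:
        return None
    return info / total_weight

def best_label(labels, components, threshold):
    """Label with the minimum NS, or None if nothing could be scored."""
    scores = {}
    for label in labels:
        ns = normalized_surprisal(label, components, threshold)
        if ns is not None:
            scores[label] = ns
    if not scores:
        return None
    return min(scores, key=scores.get)

components = [
    (0.8, {"a": 0.6, "b": 0.2}),     # probability mass 0.8, passes T = 0.5
    (0.4, {"a": 0.05, "b": 0.05}),   # probability mass 0.1, rejected
]
chosen = best_label(["a", "b"], components, threshold=0.5)
```

Here only the first component survives the threshold, so label "a" wins because its renormalized probability is higher.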
If no component is selected with respect to all hypothesized cluster labels, or if
more than one label attains the same minimum NS, then the sample is assigned a
dummy label, indicating that the estimated cluster label is still uncertain.
Zero probabilities may also be encountered in the probability estimation, in which
case an unbiased probability estimate based on entropy minimax is used. In the
regrouping algorithm, the cluster label for each sample is estimated iteratively
until a stable set of label assignments is attained.
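The handling of uncertain samples and the stopping condition might be sketched as below. This is an outline under assumptions, not the algorithm as published: the `DUMMY` marker, the `estimate` callback, the simultaneous update of all labels, and the `max_iter` cap are all introduced here for illustration.

```python
DUMMY = None  # hypothetical marker for "cluster label still uncertain"

def assign_label(ns_by_label, tol=1e-12):
    """Pick the label with the minimum NS, or DUMMY if none or tied.

    ns_by_label: dict mapping each scorable label to its NS value; it is
    empty when no component was selected for any hypothesized label.
    """
    if not ns_by_label:
        return DUMMY
    lowest = min(ns_by_label.values())
    winners = [a for a, ns in ns_by_label.items() if abs(ns - lowest) <= tol]
    return winners[0] if len(winners) == 1 else DUMMY

def regroup(labels, estimate, max_iter=100):
    """Re-estimate every label until the assignments stop changing.

    estimate(index, labels) returns the new label for sample `index`
    given the current assignments. max_iter is a safety cap (assumption).
    """
    labels = list(labels)
    for _ in range(max_iter):
        new_labels = [estimate(i, labels) for i in range(len(labels))]
        if new_labels == labels:
            break                        # stable set of assignments reached
        labels = new_labels
    return labels
```

In a fuller implementation, `estimate` would rebuild the probability estimates from the current labels, compute NS for each hypothesized label, and pass the result to `assign_label`.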
 
 