Let us now consider non-singleton sets, $B_i$. We first notice that:

$$P(\{B_1, B_2\} \mid \omega) = P(B_1 \mid \omega) + P(B_2 \mid \omega) \quad \text{whenever } B_1 \cap B_2 = \emptyset. \qquad (6.58)$$
Therefore, for any non-singleton $B_i$, we may write:

$$P(B_i \mid 1) = \sum_{b \in B_i} P(b \mid 1); \qquad P(B_i \mid 0) = \sum_{b \in B_i} P(b \mid 0). \qquad (6.59)$$
Thus, once $P_{10}$ and $P_{01}$ have been computed for all $m$ singleton sets, it is an easy task to compute them for any non-singleton category set $B_i$, using formulas (6.57) and (6.59), since for both theoretical and empirical probability mass functions pertaining to $B_i$:

$$P_{10} = p\,(1 - P(B_i \mid 1)); \qquad P_{01} = q\,P(B_i \mid 0). \qquad (6.60)$$
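Formulas (6.59) and (6.60) translate directly into code. The sketch below is illustrative only: the function and variable names (`union_error_probs`, `p_b1`, `p_b0`) are hypothetical, and the singleton conditional probabilities are assumed to be available as dictionaries.

```python
# Sketch of formulas (6.59) and (6.60): error probabilities for a
# non-singleton category set B_i, computed from singleton quantities.
# All names here are illustrative, not taken from the text.

def union_error_probs(B_i, p_b1, p_b0, p, q):
    """B_i: iterable of singleton categories b.
    p_b1[b] = P(b|1), p_b0[b] = P(b|0) for each singleton b.
    p, q: the class priors appearing in (6.60).
    Returns (P_10, P_01) for the set B_i."""
    P_Bi_1 = sum(p_b1[b] for b in B_i)   # (6.59): P(B_i|1)
    P_Bi_0 = sum(p_b0[b] for b in B_i)   # (6.59): P(B_i|0)
    P_10 = p * (1.0 - P_Bi_1)            # (6.60)
    P_01 = q * P_Bi_0                    # (6.60)
    return P_10, P_01
```

For instance, with $p = 0.6$ and $P(B_i \mid 1) = 0.8$, the sketch gives $P_{10} = 0.6 \times 0.2 = 0.12$.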
6.6.1.3 Error Entropy of Class Unions
Since the MEE approach is a two-class discrimination approach, one expects to obtain performance improvements for datasets with $c > 3$ classes by considering class unions, i.e., by including merged classes, say of $k$ classes with $k$ up to $c/2$, in the set of candidate classes. The difficulty thereof is that the number of candidate classes may become quite high. There is, however, a fast way of computing the dichotomous decision errors for unions of classes, as we shall now show.

Consider the class union $\omega = \omega_1 \cup \omega_2$, $\omega_1 \cap \omega_2 = \emptyset$, and suppose that we have computed the following three quantities for $\omega_i$, $(i = 1, 2)$, and for a decision rule $r$:
1. $n_{10}(\omega_i, r)$ — number of instances of class $i$ that do not satisfy the rule;
2. $n_{11}(\omega_i, r)$ — number of instances of class $i$ that satisfy the rule;
3. $n_r$ — number of instances (from whatever class) that satisfy the rule.
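Because $\omega_1$ and $\omega_2$ are disjoint, the first two counts for the union are simply the sums of the per-class counts, while $n_r$ does not depend on the class under consideration. A minimal sketch of this bookkeeping, assuming each class's counts are held as a `(n_10, n_11)` pair (the names are illustrative, not from the text):

```python
# Sketch: counts for a disjoint class union w = w1 U w2 follow by
# adding the per-class counts under the same decision rule r; n_r
# (instances from any class satisfying r) is unchanged by the union.
# All names here are illustrative, not taken from the text.

def union_counts(counts1, counts2):
    """counts_k = (n_10, n_11) for class w_k under rule r.
    Returns (n_10, n_11) for the union w1 U w2, with w1 and w2 disjoint."""
    n10_1, n11_1 = counts1
    n10_2, n11_2 = counts2
    return (n10_1 + n10_2, n11_1 + n11_2)
```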
$P_{10}(\omega_i)$ (i.e., $P_{10}$ for the $\omega_i$ vs $\bar{\omega}_i$ decision) is, as before, simply $n_{10}(\omega_i, r)/n$. Let us now see how one can compute $P_{01}(\omega_i)$ using $n_{11}(\omega_i, r)$ instead of $n_{01}(\omega_i, r)$. First, notice that:
$$P(r) = P(\omega_i, r) + P(\bar{\omega}_i, r); \qquad (6.61)$$

$$P(\omega_i, r) = P(\omega_i)\,P(r \mid \omega_i) = \frac{n_{11}(\omega_i, r)}{n}, \qquad (6.62)$$
where $n$ is the total number of instances at the node. From formulas (6.61) and (6.62) we derive: