\[
P_e(x_0) - P_e(x_0 - \delta) = p \int_{x_0-\delta}^{x_0} f_{X|1}(t)\,dt \;-\; q \int_{x_0-\delta}^{x_0} f_{X|-1}(t)\,dt \;<\; 0, \qquad (4.13)
\]
by condition (4.9). Using similar arguments,
$P_e(x_0) - P_e(x_0 + \delta) < 0$. Thus, $x_0$ is a minimum of $P_e(x)$. Now, suppose that (see Fig. 4.1b)
\[
p f_{X|1}(x) > q f_{X|-1}(x), \quad x \in [x_0 - \delta,\, x_0], \qquad (4.14)
\]
\[
p f_{X|1}(x) < q f_{X|-1}(x), \quad x \in [x_0,\, x_0 + \delta]. \qquad (4.15)
\]
Then $x_0$ is a maximum of $P_e(x)$. This can be proven as above, or simply by noticing that this situation is precisely the same as above but with a relabeling of the classes. For relabeled classes, the respective probability of error $P_e^{(r)}(x)$ is given by
\[
P_e^{(r)}(x) = q\,F_{X|-1}(x) + p\,(1 - F_{X|1}(x)) = 1 - \left[\, q\,(1 - F_{X|-1}(x)) + p\,F_{X|1}(x) \,\right] = 1 - P_e(x). \qquad (4.16)
\]
Thus, $P_e^{(r)}(x)$ is just a reflection of $P_e(x)$ around $1/2$, which means that maxima of $P_e(x)$ are minima of $P_e^{(r)}(x)$ and vice versa. The optimal split is chosen as the minimum, up to a relabeling.
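The reflection identity (4.16) is easy to verify numerically. The sketch below uses two Gaussian class-conditional distributions with assumed means and priors (all values are illustrative, not from the text): relabeling the classes turns the splitter's error probability $P_e(x)$ into $1 - P_e(x)$ at every split point.

```python
import math

# Illustrative two-class 1-D problem (assumed, not from the text):
# class -1 ~ N(0, 1) with prior q, class 1 ~ N(2, 1) with prior p.
p, q = 0.4, 0.6

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def F_minus1(x):
    """F_{X|-1}(x): CDF of class -1, centered at 0."""
    return phi(x)

def F_plus1(x):
    """F_{X|1}(x): CDF of class 1, centered at 2."""
    return phi(x - 2.0)

def P_e(x):
    """Splitter error: decide -1 below x, 1 above x."""
    return q * (1.0 - F_minus1(x)) + p * F_plus1(x)

def P_e_relabeled(x):
    """Error of the same splitter after swapping the class labels."""
    return q * F_minus1(x) + p * (1.0 - F_plus1(x))

# P_e^{(r)}(x) = 1 - P_e(x) for every x, so maxima and minima swap roles.
for x in (-1.0, 0.5, 1.0, 2.5):
    assert abs(P_e(x) + P_e_relabeled(x) - 1.0) < 1e-12
```

The identity holds for any class-conditional distributions, since the two error expressions always sum to $p + q = 1$.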
Does the solution obtained by minimizing the entropy of the errors produced by a data splitter correspond to the min $P_e$ solution for the problem? This issue is discussed in the following sections, where the error entropy (4.2) is analyzed as a function of the data splitter parameter $x$, according to the following formulas:
\[
\begin{aligned}
P_{-1} &\equiv P_{-1}(x) = q\,(1 - F_{X|-1}(x)), \\
P_{1}  &\equiv P_{1}(x) = p\,F_{X|1}(x), \qquad (4.17) \\
P_{0}  &\equiv P_{0}(x) = q\,F_{X|-1}(x) + p\,(1 - F_{X|1}(x)) = 1 - P_{-1} - P_{1}.
\end{aligned}
\]
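A minimal sketch of how the error probabilities (4.17) and the resulting Shannon error entropy can be evaluated as functions of the split point $x$. The uniform class-conditional distributions, supports, and priors below are assumed purely for illustration.

```python
import math

p, q = 0.5, 0.5          # class priors (assumed values)
a, b = 0.0, 2.0          # support of class -1 (assumed)
c, d = 1.0, 3.0          # support of class 1, overlapping class -1 (assumed)

def cdf_uniform(x, lo, hi):
    """CDF of the uniform distribution on [lo, hi]."""
    return min(max((x - lo) / (hi - lo), 0.0), 1.0)

def error_probs(x):
    """The three probabilities of (4.17) for split point x."""
    P_m1 = q * (1.0 - cdf_uniform(x, a, b))   # class -1 classified as 1
    P_p1 = p * cdf_uniform(x, c, d)           # class 1 classified as -1
    P_0 = 1.0 - P_m1 - P_p1                   # correct decisions
    return P_m1, P_p1, P_0

def H_S(x):
    """Shannon entropy of the error variable E at split point x."""
    return -sum(P * math.log(P) for P in error_probs(x) if P > 0.0)
```

Since the classes overlap in this configuration, any split inside the overlap leaves both error probabilities positive, and the entropy $H_S$ is strictly positive there.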
4.1.1 Motivational Examples
Example 4.1.
Consider two uniformly distributed classes with PDFs given by
\[
f_{X|-1}(x) = \frac{1}{b-a}\,\mathbb{1}_{[a,b]}(x), \qquad f_{X|1}(x) = \frac{1}{d-c}\,\mathbb{1}_{[c,d]}(x), \qquad (4.18)
\]
such that $a < c$ ($\omega_{-1}$ is at the left of $\omega_1$). If the classes are separable, that is, $a < b < c < d$, the min $P_e$ solution is obtained if $x^{*}$ is set anywhere in $]b, c[$. The minimum entropy value $H_S(E) = 0$ also occurs in that interval, because $P(E = 0) = 1$. So, minimizing the Shannon entropy of the error (SEE) leads in this case to the min $P_e$ solution.
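Example 4.1 can be checked numerically. The endpoint values below are illustrative choices satisfying $a < b < c < d$; any split inside $]b, c[$ yields both zero error probability and zero error entropy.

```python
import math

# Separable uniform classes (illustrative values with a < b < c < d).
p, q = 0.5, 0.5
a, b, c, d = 0.0, 1.0, 2.0, 3.0

def cdf_uniform(x, lo, hi):
    """CDF of the uniform distribution on [lo, hi]."""
    return min(max((x - lo) / (hi - lo), 0.0), 1.0)

def P_e(x):
    """Splitter error probability P_{-1}(x) + P_1(x), as in (4.17)."""
    return q * (1.0 - cdf_uniform(x, a, b)) + p * cdf_uniform(x, c, d)

def H_S(x):
    """Shannon entropy of the error variable E at split point x."""
    P_m1 = q * (1.0 - cdf_uniform(x, a, b))
    P_p1 = p * cdf_uniform(x, c, d)
    P_0 = 1.0 - P_m1 - P_p1
    return -sum(P * math.log(P) for P in (P_m1, P_p1, P_0) if P > 0.0)

# Every split in ]b, c[ gives P(E = 0) = 1, hence P_e = 0 and H_S(E) = 0.
for x in (1.2, 1.5, 1.8):
    assert P_e(x) == 0.0
    assert H_S(x) == 0.0
```

Outside $]b, c[$ one of the error probabilities becomes positive, so both $P_e$ and $H_S$ rise above zero, matching the conclusion of the example.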