$$
P_e(x_0) - P_e(x_0 - \delta) = p \int_{x_0-\delta}^{x_0} f_{X|1}(t)\,dt - q \int_{x_0-\delta}^{x_0} f_{X|-1}(t)\,dt < 0, \qquad (4.13)
$$
by condition (4.9). Using similar arguments, $P_e(x_0) - P_e(x_0 + \delta) < 0$. Thus
$x_0$ is a minimum of $P_e(x)$. Now, suppose that (see Fig. 4.1b)
$$
p f_{X|1}(x) > q f_{X|-1}(x), \quad x \in [x_0 - \delta, x_0], \qquad (4.14)
$$
$$
p f_{X|1}(x) < q f_{X|-1}(x), \quad x \in [x_0, x_0 + \delta]. \qquad (4.15)
$$
Then $x_0$ is a maximum of $P_e(x)$. This can be proven as above or just by
noticing that this situation is precisely the same as above but with a relabeling
of the classes. For relabeled classes, the respective probability of error $P_e^{(r)}(x)$
is given by
$$
P_e^{(r)}(x) = p\left(1 - F^{(r)}_{X|-1}(x)\right) + q F^{(r)}_{X|1}(x) = 1 - \left[ q\left(1 - F_{X|-1}(x)\right) + p F_{X|1}(x) \right] = 1 - P_e(x). \qquad (4.16)
$$
Thus, $P_e^{(r)}(x)$ is just a reflection of $P_e(x)$ around $1/2$, which means that $P_e(x)$
maxima are $P_e^{(r)}(x)$ minima and vice versa. The optimal split is chosen as
the minimum up to a relabeling.
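The reflection property (4.16) can be verified numerically. The following sketch uses two hypothetical Gaussian class-conditional distributions; the Gaussian family, the means and variances, and the priors are illustrative assumptions, not taken from the text:

```python
import math

def norm_cdf(x, mu, sigma):
    """CDF of a Gaussian N(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Illustrative (assumed) setup: class -1 ~ N(0,1) at the left, class 1 ~ N(2,1),
# with priors q = P(omega_{-1}) and p = P(omega_1).
p, q = 0.4, 0.6

def P_e(x):
    """Error probability q(1 - F_{X|-1}(x)) + p F_{X|1}(x) of the split at x."""
    return q * (1.0 - norm_cdf(x, 0.0, 1.0)) + p * norm_cdf(x, 2.0, 1.0)

def P_e_relabeled(x):
    """Error probability after swapping the labels: p(1 - F_{X|1}(x)) + q F_{X|-1}(x)."""
    return p * (1.0 - norm_cdf(x, 2.0, 1.0)) + q * norm_cdf(x, 0.0, 1.0)

# Check the reflection property (4.16): P_e^{(r)}(x) = 1 - P_e(x) at every split point.
for x in [-1.0, 0.0, 1.0, 2.0, 3.0]:
    assert abs(P_e_relabeled(x) - (1.0 - P_e(x))) < 1e-12
```

The identity holds pointwise for any choice of distributions and priors, since the two expressions always sum to $p + q = 1$.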
Does the solution obtained by minimizing the entropy of the errors produced
by a data splitter correspond to the min $P_e$ solution for the problem? This
issue is discussed in the following sections, where the error entropy (4.2) is
analyzed as a function of the data splitter parameter $x$, according to the
following formulas
$$
P_{-1}(x) = q\left(1 - F_{X|-1}(x)\right), \qquad P_1(x) = p F_{X|1}(x),
$$
$$
P_0(x) = 1 - P_{-1}(x) - P_1(x) = q F_{X|-1}(x) + p\left(1 - F_{X|1}(x)\right). \qquad (4.17)
$$
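As a concrete illustration of this analysis, the sketch below evaluates the error probabilities of (4.17) and the Shannon entropy of the error as functions of the split point $x$, again for two assumed Gaussian classes (the distributions, priors, and scan range are hypothetical choices, not from the text):

```python
import math

def norm_cdf(x, mu, sigma):
    """Gaussian CDF via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Illustrative (assumed) setup: omega_{-1} ~ N(0,1), omega_1 ~ N(2,1), equal priors.
p, q = 0.5, 0.5

def error_probs(x):
    """P_{-1}, P_0, P_1 of formulas (4.17) for the split at x."""
    P_m1 = q * (1.0 - norm_cdf(x, 0.0, 1.0))  # class -1 sample lands right of x
    P_p1 = p * norm_cdf(x, 2.0, 1.0)          # class 1 sample lands left of x
    return P_m1, 1.0 - P_m1 - P_p1, P_p1      # middle entry: no error

def H_S(x):
    """Shannon entropy of the error variable, in nats (0 log 0 := 0)."""
    return -sum(P * math.log(P) for P in error_probs(x) if P > 0.0)

def P_e(x):
    """Probability of error of the split at x."""
    P_m1, _, P_p1 = error_probs(x)
    return P_m1 + P_p1

# Scan candidate split points and locate the minimizers of both criteria.
xs = [i / 100.0 for i in range(-200, 401)]
x_min_pe = min(xs, key=P_e)
x_min_hs = min(xs, key=H_S)
print(x_min_pe, x_min_hs)
```

In this symmetric setting the min $P_e$ split sits at the midpoint of the two means; whether the entropy minimizer coincides with it is exactly the question the following sections address.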
4.1.1 Motivational Examples
Example 4.1. Consider two uniformly distributed classes with PDFs given by
$$
f_{X|-1}(x) = \frac{1}{b-a}\,\mathbb{1}_{[a,b]}(x), \qquad f_{X|1}(x) = \frac{1}{d-c}\,\mathbb{1}_{[c,d]}(x), \qquad (4.18)
$$
such that $a < c$ ($\omega_{-1}$ is at the left of $\omega_1$). If the classes are separable, that is
$a < b < c < d$, the min $P_e$ solution is obtained if $x$ is set anywhere in $]b, c[$.
The minimum entropy value $H_S(E) = 0$ also occurs in that interval because
$P(E = 0) = 1$. So, minimizing Shannon's entropy of the error (SEE) leads in
this case to the min $P_e$ solution.
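The separable uniform case can be checked numerically. The interval endpoints and priors below are illustrative assumptions satisfying $a < b < c < d$:

```python
import math

# Illustrative (assumed) intervals: omega_{-1} on [0,1], omega_1 on [2,3], equal priors.
a, b, c, d = 0.0, 1.0, 2.0, 3.0
p, q = 0.5, 0.5

def F_unif(x, lo, hi):
    """CDF of the uniform distribution on [lo, hi]."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))

def error_probs(x):
    """P_{-1}, P_0, P_1 of formulas (4.17) for the split at x."""
    P_m1 = q * (1.0 - F_unif(x, a, b))
    P_p1 = p * F_unif(x, c, d)
    return P_m1, 1.0 - P_m1 - P_p1, P_p1

def H_S(x):
    """Shannon entropy of the error (0 log 0 := 0)."""
    return -sum(P * math.log(P) for P in error_probs(x) if P > 0.0)

# Any split inside ]b, c[ yields P(E = 0) = 1, hence P_e = 0 and H_S(E) = 0.
for x in [1.2, 1.5, 1.8]:
    P_m1, P_0, P_p1 = error_probs(x)
    assert P_m1 + P_p1 == 0.0 and P_0 == 1.0 and H_S(x) == 0.0
```

Both criteria thus agree on the whole interval $]b, c[$ in the separable case, as the text states.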