[Fig. 4.7: three panels plotting the error entropy H_S(x′) against x′, for (a) h = 0.2, (b) h = 0.498, (c) h = 2.27]
Fig. 4.7 Error entropy for different values of h in the Gaussian distribution example of Sect. 4.1.3.1 with d = 1.5. The location of the optimal solution is marked with a vertical line.
formula (E.5)). In practice, the increased h has the effect of bringing the classes closer together (recall the discussion at the end of the previous section), turning the minimization problem into a maximization one. Of course, one should not increase h indefinitely, because the optimization algorithm would make gross mistakes when dealing with an almost flat entropy curve. A discussion of this issue, as well as additional experiments with real-world datasets, can be found in [212, 216].
4.1.3.2 Resubstitution Estimates
The kernel-based estimation method is unattractive in practice due to its computational burden. It is far simpler to compute the empirical estimate of (4.2) using the usual resubstitution estimation of the probabilities. For notational simplicity, in this and the following sections we use 0-1 class labeling (T = {0, 1}) and write the empirical estimate of (4.2) as
SEE(P01, P10) = − P01 ln P01 − P10 ln P10 − (1 − P01 − P10) ln(1 − P01 − P10)   (4.46)
with
P10 ≡ P(E = 1) = P(T = 1, Y = 0), the error probability of class ωk,
P01 ≡ P(E = −1) = P(T = 0, Y = 1), the error probability of class ω̄k,
where ωk, k = 1, ..., c, is our class of interest in some two-class discrimination problem, which we want to discriminate from its complement ω̄k. Note that (4.46) is simply the Shannon entropy of the error variable E = T − Y, whose remaining outcome, E = 0 (correct classification), has probability 1 − P01 − P10.
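For concreteness, here is a minimal sketch of (4.46) in Python; the function name and the convention 0 · ln 0 = 0 are our choices for illustration, not part of the text:

```python
import math

def see(p01: float, p10: float) -> float:
    """Shannon error entropy SEE(P01, P10) of Eq. (4.46).

    The three terms are the probabilities of E = -1, E = 1 and E = 0;
    the convention 0 * ln 0 = 0 lets zero-probability terms vanish.
    """
    terms = [p01, p10, 1.0 - p01 - p10]
    return -sum(p * math.log(p) for p in terms if p > 0.0)
```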
The resubstitution estimate of SEE uses the error rate estimates P10 = n10/n and P01 = n01/n, with n_{tt̄} denoting the number of class-t cases classified as t̄ and n the total number of cases.
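Continuing the sketch above, the resubstitution estimate simply plugs the empirical error rates into see; the counts below are made-up illustration values:

```python
def see_resubstitution(n01: int, n10: int, n: int) -> float:
    """Resubstitution estimate of SEE: P01 = n01/n, P10 = n10/n."""
    return see(n01 / n, n10 / n)

# Hypothetical example: n = 100 cases, with 5 cases of class 0
# classified as 1 and 8 cases of class 1 classified as 0.
print(see_resubstitution(n01=5, n10=8, n=100))  # ~ 0.473
```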