Table 4.1  Percentage of test error and standard deviations (in parentheses)
obtained with SEE for the simulated Gaussian data. Minimization was used for
distances d = 3 and d = 1.5, and maximization for d = 1.

  d     Bayes error   n_t     n_d = 200     n_d = 2000    n_d = 20000
  3     6.68%         10^2    6.79 (2.41)   6.75 (2.51)   6.75 (2.51)
                      10^4    6.82 (0.83)   6.70 (0.81)   6.66 (0.81)
                      10^6    6.81 (0.20)   6.69 (0.08)   6.68 (0.08)
  1.5   22.66%        10^2    25.23 (4.65)  24.67 (4.58)  22.61 (4.21)
                      10^4    25.32 (2.49)  24.72 (2.15)  22.80 (0.46)
                      10^6    25.46 (2.54)  24.83 (2.21)  22.82 (0.24)
  1     30.85%        10^2    30.63 (4.64)  30.90 (4.48)  30.70 (4.82)
                      10^4    30.93 (0.47)  30.87 (0.47)  30.84 (0.46)
                      10^6    30.93 (0.17)  30.86 (0.14)  30.85 (0.14)
Table 4.2  Percentage of test error and standard deviation (in parentheses)
obtained with maximization of SEE with increased h (h = 2.27) for d = 1.5.

  d     Bayes error   n_t     n_d = 200     n_d = 2000    n_d = 20000
  1.5   22.66%        10^2    22.95 (3.93)  22.78 (4.01)  22.47 (4.14)
                      10^4    22.73 (0.41)  22.65 (0.43)  22.66 (0.41)
                      10^6    22.75 (0.17)  22.67 (0.14)  22.67 (0.13)
Table 4.1 shows the mean values and standard deviations over 1000 repetitions of the test error of each experiment, using h = 1.7, 0.1, and 0.8 for d = 1, 1.5, and 3, respectively. When the amount of available data is huge, SEE achieves Bayes discrimination, as expected. For small datasets the picture is quite different: SEE still finds a good solution for d = 3 and d = 1, but performs poorly for d = 1.5, which is near the t value for Gaussian classes, that is, in the limbo between the choice to minimize or maximize. The reason lies in the highly non-smooth estimate of the input distributions produced by the KDE method when h < h_IMSE. This can be solved by using fat estimation of the input PDFs, as we did for d = 1 and d = 3. As shown in Fig. 4.7, when h is too small one gets a non-smooth entropy function, while for large h the over-smoothed input PDF estimates provide a smooth entropy curve that preserves the maximum. The results of Table 4.2 were obtained by applying fat estimation (h = 2.27) to the case d = 1.5. SEE now performs similarly to the cases d = 1 and d = 3.
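As a rough illustration of this bandwidth effect, the sketch below (Python; not the SEE implementation used in these experiments) computes a plug-in Shannon entropy estimate from a Gaussian KDE of a one-dimensional sample for a few bandwidths. The sample size and the h values are illustrative assumptions, with 2.27 matching the fat estimate used above.

import numpy as np

# Illustrative sketch only (not the SEE implementation of the experiments):
# a plug-in Shannon entropy estimate based on a Gaussian KDE, showing how the
# bandwidth h controls the smoothness of the estimated PDF and of the entropy.
# Sample size and h values below are assumptions chosen for illustration.

def kde_gauss(x_eval, sample, h):
    """Gaussian-kernel density estimate evaluated on the grid x_eval."""
    u = (x_eval[:, None] - sample[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (len(sample) * h * np.sqrt(2 * np.pi))

def shannon_entropy(sample, h, grid_size=512):
    """Plug-in estimate of -integral f(x) log f(x) dx using the KDE."""
    x = np.linspace(sample.min() - 4 * h, sample.max() + 4 * h, grid_size)
    f = np.clip(kde_gauss(x, sample, h), 1e-300, None)   # avoid log(0)
    return -np.trapz(f * np.log(f), x)

rng = np.random.default_rng(0)
sample = rng.normal(0.0, 1.0, size=200)     # small dataset, as in the n_d = 200 column

for h in (0.1, 0.8, 2.27):                  # under-smoothed, moderate, "fat" estimation
    print(f"h = {h:4.2f}  ->  estimated entropy = {shannon_entropy(sample, h):.3f}")

In the experiments the relevant quantity is the entropy as a function of the classifier parameters: with the fat h the KDE-based entropy varies smoothly as those parameters change, so the extremum sought by SEE is not masked by sampling noise.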
The explanation for this behavior lies in the increased variance of the estimated PDFs, which for a Gaussian kernel is given by σ = √(s² + h²) (see
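The increase in variance is the standard Gaussian convolution identity: the expected KDE of data drawn from a Gaussian with standard deviation s, smoothed by a Gaussian kernel of bandwidth h, is again Gaussian with the variances added. In LaTeX form (a textbook property, not a result specific to SEE):

% Expected Gaussian-kernel KDE of data drawn from N(mu, s^2): smoothing is a
% convolution with the kernel, and convolving two Gaussians adds their variances.
\mathbb{E}\big[\hat{f}_h(x)\big]
  = \big(\mathcal{N}(\cdot\,;\mu,s^{2}) * \mathcal{N}(\cdot\,;0,h^{2})\big)(x)
  = \mathcal{N}\big(x;\,\mu,\;s^{2}+h^{2}\big),
\qquad \sigma = \sqrt{s^{2}+h^{2}} .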
 