MEE with Continuous Errors - Minimum Error Entropy Classification


[Figure 3.16: weights w_0, w_1, w_2 (vertical axis w_k, 0 to 0.6) versus epochs (horizontal axis, 0 to 35)]

Fig. 3.16 Evolution of the weights in an experiment where the perceptron was trained to solve the $\mu_1 = [1\ 0]^T$ close-classes case, for bivariate Gaussian inputs.

3.3.2.2 Realistic Datasets

Comparison of theoretical and empirical MEE behaviors has to be restricted to two-dimensional problems, in order to keep graphical representations viable and the running time of the Nelder-Mead algorithm reasonable. One may, nonetheless, use dimensionally reduced real-world datasets. In what follows, when the original datasets have more than two features, we consider the plane of their first two principal components, denoted $(x_1, x_2)$.
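
As a minimal sketch of this reduction step, the projection onto the first two principal components can be computed from the singular value decomposition of the centered data. The function name and the synthetic data below are illustrative assumptions, and the sketch centers the features without standardizing them, since the text does not say which convention was used.

```python
import numpy as np

def first_two_principal_components(X):
    """Project the rows of X (n_samples x n_features) onto the first two PCs."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # center each feature (no standardization: an assumption)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T     # columns play the role of (x_1, x_2)

# Synthetic stand-in for a dataset with more than two features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) * np.array([3.0, 2.0, 1.0, 0.5, 0.1])
Z = first_two_principal_components(X)
```

The two columns of `Z` are uncorrelated by construction, with the first carrying the larger variance.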

Since the true $(X_1, X_2)$ joint distributions are unknown, the true theoretical MEE solutions cannot be derived. One is still able, however, to derive theoretical MEE solutions of very closely resembling problems, proceeding in the following way: first, model the bivariate real-world PDFs by appropriate distributions, chosen so that they reproduce the covariance matrices of the data and minimize the $L_1$ distance between the modeled and estimated marginal PDFs; next, apply to these modeled PDFs the procedure outlined in Sect. 3.1.1 (numerical simulation).

PDF modeling of the marginal class-conditional distributions is achieved by first obtaining from the data the Parzen window estimates, $f_{X|t}$, using the optimal $h_{IMSE}$ bandwidth. Next, one fits known PDFs (namely, Gaussian, Gamma, and Weibull) by minimizing the $L_1$ distance between $f_{X|t}$ and its model. The $L_1$ distance is preferable to other distance metrics (namely, $L_2$) for reasons described in [53]. Finally, for numerical computation of the theoretical MEE one generates a large number of points with the modeled class-conditional distributions and with the same estimated covariance matrix.
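
The marginal modeling step can be sketched as follows, under several assumptions that are not taken from the text: a Gaussian kernel, the normal-reference bandwidth $h = 1.06\,\sigma\,n^{-1/5}$ as a stand-in for $h_{IMSE}$, a trapezoidal approximation of the $L_1$ integral, Nelder-Mead as the minimizer, and a synthetic positive-valued sample. All function names are hypothetical.

```python
import numpy as np
from scipy import stats
from scipy.optimize import minimize

def parzen_estimate(sample, grid, h):
    """Gaussian-kernel Parzen window estimate of the PDF, evaluated on grid."""
    u = (grid[:, None] - sample[None, :]) / h
    return stats.norm.pdf(u).sum(axis=1) / (sample.size * h)

def l1_distance(f, g, grid):
    """Trapezoidal approximation of the integral of |f - g| over grid."""
    d = np.abs(f - g)
    return float(np.sum((d[1:] + d[:-1]) * np.diff(grid)) / 2.0)

def pos(v):
    """Map a free parameter to a strictly positive one."""
    return abs(v) + 1e-12

def fit_by_l1(sample, grid, f_parzen):
    """Fit Gaussian, Gamma and Weibull models by minimizing the L1 distance."""
    m, s = sample.mean(), sample.std(ddof=1)
    # Each entry: (parameters -> frozen distribution, starting parameters).
    models = {
        "gaussian": (lambda p: stats.norm(loc=p[0], scale=pos(p[1])), [m, s]),
        "gamma": (lambda p: stats.gamma(a=pos(p[0]), scale=pos(p[1])),
                  [(m / s) ** 2, s ** 2 / m]),  # method-of-moments start
        "weibull": (lambda p: stats.weibull_min(c=pos(p[0]), scale=pos(p[1])),
                    [1.5, m]),
    }
    fits = {}
    for name, (make, start) in models.items():
        objective = lambda p, make=make: l1_distance(f_parzen, make(p).pdf(grid), grid)
        res = minimize(objective, start, method="Nelder-Mead")
        fits[name] = (make(res.x), res.fun)  # (fitted model, L1 distance)
    return fits

# Synthetic stand-in for one class-conditional marginal.
rng = np.random.default_rng(0)
sample = rng.gamma(shape=3.0, scale=2.0, size=500)
h = 1.06 * sample.std(ddof=1) * sample.size ** (-1 / 5)  # assumed bandwidth rule
grid = np.linspace(sample.min() - 4 * h, sample.max() + 4 * h, 512)
f_hat = parzen_estimate(sample, grid, h)
fits = fit_by_l1(sample, grid, f_hat)
best_name = min(fits, key=lambda k: fits[k][1])
```

Each fitted model is a frozen SciPy distribution, so `fits[best_name][0].rvs(size=...)` draws points from the selected marginal. Imposing the estimated covariance matrix on the generated bivariate points is a separate step that this sketch does not cover.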

In the work [219] the datasets of Table 3.2 were analyzed. The datasets WDBC (30 features), Thyroid (5 features), and Wine (13 features) are from [13]; PB12, a dataset with 2 features, is from [110]. For the first three datasets the first two principal components were computed. These new datasets were
