named with a subscript '2' in Table 3.2 to distinguish them from the original ones. Next, the respective bivariate PDF models were obtained and an appropriately large number of instances was generated, maintaining the original class proportions. The total number of instances, also given in Table 3.2, guaranteed IMSE < 0.01 for the estimated error PDFs. The empirical H_S-MEE solutions for the same datasets were computed, as well as the min P_e solutions obtained with the Nelder-Mead algorithm.
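The min P_e computation can be sketched as follows: a direct minimization of the training-set error rate of a linear discriminant with the Nelder-Mead simplex method, which needs no gradients and therefore copes with the piecewise-constant error surface. The data, discriminant form, and starting point below are illustrative assumptions, not the book's datasets or setup.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
# Two synthetic Gaussian classes in 2D (illustrative, not the book's data)
X = np.vstack([rng.normal([-1.0, 0.0], 1.0, size=(200, 2)),
               rng.normal([1.0, 0.0], 1.0, size=(200, 2))])
y = np.r_[np.zeros(200), np.ones(200)]

def error_rate(w):
    # Linear discriminant: predict class 1 when w[0]*x1 + w[1]*x2 + w[2] > 0
    pred = (X @ w[:2] + w[2] > 0).astype(float)
    return float(np.mean(pred != y))

# Nelder-Mead only compares function values at simplex vertices, so the
# discontinuous, piecewise-constant error-rate surface is not a problem
res = minimize(error_rate, x0=np.array([1.0, 0.0, 0.0]), method="Nelder-Mead")
min_pe_estimate = res.fun  # training-set estimate of min P_e
```

Because the error rate is flat almost everywhere, the simplex can stall on plateaus; in practice several restarts from different initial weights are advisable.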
For the multiclass datasets a sequential approach was followed, whereby
the final classification was the result of successive dichotomies.
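The successive-dichotomies idea can be sketched as follows: classes are peeled off one at a time, each by a two-class discriminant trained on the samples still unassigned. The helper names and the mean-difference discriminant below are illustrative stand-ins for an MEE-trained perceptron, not the book's procedure.

```python
import numpy as np

def train_dichotomy(X, y):
    """Hypothetical stand-in for one trained dichotomizer: a simple
    mean-difference linear discriminant (illustration only)."""
    m1, m0 = X[y == 1].mean(axis=0), X[y == 0].mean(axis=0)
    w = m1 - m0
    b = -0.5 * (m1 + m0) @ w
    return lambda x: (x @ w + b > 0).astype(int)

def sequential_classify(X, y, classes):
    """Multiclass decision by successive dichotomies: resolve class k
    vs. the remaining classes, then recurse on what is left."""
    labels = np.full(len(X), -1)
    remaining = np.ones(len(X), dtype=bool)
    for k in classes[:-1]:
        yk = (y == k).astype(int)
        clf = train_dichotomy(X[remaining], yk[remaining])
        hit = np.zeros(len(X), dtype=bool)
        hit[remaining] = clf(X[remaining]) == 1
        labels[hit] = k
        remaining &= ~hit
    labels[remaining] = classes[-1]  # last class takes the leftovers
    return labels

# Three well-separated synthetic clusters (illustrative data)
rng = np.random.default_rng(0)
means = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
X = np.vstack([rng.normal(m, 0.5, size=(50, 2)) for m in means])
y = np.repeat(np.arange(3), 50)
pred = sequential_classify(X, y, [0, 1, 2])
```

Note that errors made in an early dichotomy cannot be undone later, so the ordering of the dichotomies can matter.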
Table 3.2 shows the training set error rates obtained with both the theoretical and the empirical algorithms. They are in general close to the min P_e values (computed with the Nelder-Mead algorithm), the only exceptions being the theoretical MEE error rates for Thyroid_2 and PB12. Further details on these experiments are provided in the cited work [219].
Table 3.2  Error rates for the empirical and theoretical MEE algorithms, together with min P_e values, for four realistic datasets.

Dataset     No. classes  No. instances  Empirical MEE  Theoretical MEE  min P_e
                                        error rate     error rate
WDBC_2           2            2390         0.0824          0.0890       0.0808
Thyroid_2        3            2509         0.0367          0.0458       0.0375
Wine_2           3            5000         0.0553          0.0546       0.0526
PB12             4            6000         0.1072          0.1410       0.1067
The datasets, together with the decision borders achieved by the three algorithms, are shown in Fig. 3.17. The decision borders are almost coincident, except for the theoretical MEE borders of the Thyroid_2 and PB12 datasets.
3.3.3 The Arctangent Perceptron
Analytical expressions of theoretical EEs derived by the application of Theorem 3.2 can easily get quite involved, even for simple classifier settings. Usually, closed-form algebraic expressions of the entropies are simply impossible to obtain. A notable exception to this rule is Rényi's quadratic entropy of the arctangent perceptron with independent Gaussian inputs, which we study now. The arctangent perceptron (or arctan perceptron, for short) is a perceptron whose activation function is the arctangent function.
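As a concrete illustration (not the book's derivation), the sketch below evaluates an arctan perceptron and the standard kernel-based estimate of the information potential of its errors, whose negative logarithm estimates Rényi's quadratic entropy. The data, weights, targets, and the kernel bandwidth sigma are arbitrary assumptions.

```python
import numpy as np

def arctan_perceptron(X, w, b):
    # Output lies in (-pi/2, pi/2); targets must be scaled to this range
    return np.arctan(X @ w + b)

def information_potential(e, sigma=0.5):
    """Empirical information potential V = (1/N^2) sum_ij G(e_i - e_j),
    with a Gaussian kernel of bandwidth sigma*sqrt(2) arising from the
    convolution of two Parzen kernels of bandwidth sigma."""
    d = e[:, None] - e[None, :]
    return float(np.mean(np.exp(-d**2 / (4 * sigma**2))
                         / np.sqrt(4 * np.pi * sigma**2)))

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 2))                       # illustrative inputs
t = np.where(X[:, 0] > 0, np.pi / 4, -np.pi / 4)    # targets in atan's range
e = t - arctan_perceptron(X, np.array([2.0, 0.0]), 0.0)
V = information_potential(e)
H_R2 = -np.log(V)  # Renyi quadratic entropy estimate of the error PDF
```

Maximizing V (equivalently, minimizing H_R2) concentrates the error distribution, which is the working principle behind MEE training.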
Lemma 3.1. The information potential of a two-class arctan perceptron (atan(·) activation function), fed with independent Gaussian inputs having mean μ_t and diagonal covariance matrix Σ_t, t denoting the class code, is