[Figure: weights w_k (curves w_0, w_1, w_2) versus epochs, 0 to 35]
Fig. 3.16 Evolution of the weights in an experiment where the perceptron was
trained to solve the μ_1 = [1 0]^T close-classes case, for bivariate Gaussian
inputs.
3.3.2.2 Realistic Datasets
Comparison of theoretical and empirical MEE behaviors has to be restricted
to two-dimensional problems, so that graphical representations remain viable
and the Nelder-Mead algorithm runs in reasonable time. One may, nonetheless,
use dimensionally reduced real-world datasets. In what follows we consider
the plane of the first two principal components (denoted (x_1, x_2)) of the
original datasets when these have more than two features.
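The reduction to the plane of the first two principal components can be
sketched as follows. This is a minimal illustration, assuming NumPy and
unstandardized features (the text does not say whether the features were
scaled beforehand); the function name and the synthetic data are hypothetical
stand-ins, not the datasets discussed here.

```python
import numpy as np

def first_two_principal_components(X):
    """Project the rows of X (n_samples x n_features) onto the first two PCs."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                  # center each feature
    cov = np.cov(Xc, rowvar=False)           # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:2]    # indices of the two largest
    return Xc @ eigvecs[:, order]            # columns play the role of (x_1, x_2)

# Synthetic stand-in for a dataset with more than two features (assumption)
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 5)) * np.array([3.0, 2.0, 1.0, 0.5, 0.1])
scores = first_two_principal_components(data)
print(scores.shape)
```

The two returned columns are uncorrelated by construction, with the first
carrying the larger variance.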
Since the true (X_1, X_2) joint distributions are unknown, the true
theoretical MEE solutions cannot be derived. One is still able, however, to
derive theoretical MEE solutions of very closely resembling problems,
proceeding in the following way: first, model the bivariate real-world PDFs
by appropriate distributions that have the same covariance matrices and whose
marginal PDFs are at minimum L_1 distance from those estimated from the data;
next, apply to these modeled PDFs the procedure outlined in Sect. 3.1.1
(numerical simulation).
PDF modeling of the marginal class-conditional distributions is achieved
by first obtaining from the data the Parzen window estimates, f_{X|t}, using
the optimal h_IMSE bandwidth. Next, one proceeds to adjust adequate known
PDFs (namely, Gaussian, Gamma, and Weibull) by minimizing the L_1 distance
between f_{X|t} and its model. The L_1 distance is preferable to other
distance metrics (namely, L_2) for reasons described in [53]. Finally, for
numerical computation of the theoretical MEE one generates a large number of
points with the modeled class-conditional distributions and with the same
estimated covariance matrix.
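The marginal modeling step can be illustrated with a rough sketch, under
several assumptions that are not taken from the text: a Gaussian kernel, the
normal-reference formula h = 1.06 s n^(-1/5) as an approximation to the
IMSE-optimal bandwidth, a Gaussian model only (Gamma and Weibull would be
handled in the same manner), and a plain exhaustive search in place of
whatever optimizer was actually used. All function names are hypothetical.
The L_1 distance is approximated as the integral of |f - g| on a uniform grid.

```python
import numpy as np

def parzen_estimate(samples, grid):
    """Gaussian-kernel Parzen window estimate evaluated on a grid."""
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    # Normal-reference approximation to the IMSE-optimal bandwidth (assumption)
    h = 1.06 * samples.std(ddof=1) * n ** (-1 / 5)
    z = (grid[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (n * h * np.sqrt(2 * np.pi))

def gaussian_pdf(grid, mean, std):
    """Gaussian density used as the candidate model."""
    return np.exp(-0.5 * ((grid - mean) / std) ** 2) / (std * np.sqrt(2 * np.pi))

def l1_distance(f, g, grid):
    """Riemann-sum approximation of the integral of |f - g| on a uniform grid."""
    return np.sum(np.abs(f - g)) * (grid[1] - grid[0])

def fit_gaussian_l1(f_hat, grid, means, stds):
    """Exhaustive search over candidate (mean, std) pairs for minimum L_1."""
    best = (np.inf, None, None)
    for m in means:
        for s in stds:
            d = l1_distance(f_hat, gaussian_pdf(grid, m, s), grid)
            if d < best[0]:
                best = (d, m, s)
    return best

# Synthetic one-dimensional class-conditional sample (stand-in data)
rng = np.random.default_rng(1)
x = rng.normal(2.0, 1.5, size=500)
grid = np.linspace(-6.0, 10.0, 801)
f_hat = parzen_estimate(x, grid)
dist, mean_fit, std_fit = fit_gaussian_l1(
    f_hat, grid, np.linspace(1.0, 3.0, 41), np.linspace(1.0, 2.5, 31)
)
print(dist, mean_fit, std_fit)
```

The grid search keeps the sketch dependency-free; a simplex or gradient-free
optimizer could replace it without changing the objective.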
In the work [219] the datasets of Table 3.2 were analyzed. The datasets
WDBC (30 features), Thyroid (5 features), and Wine (13 features) are from
[13]; PB12, a dataset with 2 features, is from [110]. For the first three datasets
the first two principal components were computed. These new datasets were