Fig. 2.24. Outputs of two models with 4 hidden neurons: the model that has the
minimal TMSE, and the model that has the minimal virtual leave-one-out score
Fig. 2.25. Variation of the TMSE and of the virtual leave-one-out score as a function
of the number of hidden neurons
The TMSE decreases when the number of hidden neurons increases, whereas the virtual leave-one-out score seems to go through a minimum and subsequently to increase. However, the choice between 2, 3 and 4 hidden neurons is not clear-cut, since the corresponding leave-one-out scores are very similar. The next section is devoted to the problem of choosing the most appropriate architecture.
For more than three hidden neurons, the TMSE becomes smaller than the
standard deviation of the noise; one can rightly conclude that models with
more than three hidden neurons tend to be overfitted. However, that is not a
practical selection criterion, since, in real applications, the standard deviation
of the noise is generally unknown.
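The point above can be illustrated with a small numerical experiment. The sketch below is hypothetical: it uses polynomials of increasing degree as a stand-in for networks with increasing numbers of hidden neurons, and generates its own noisy data so that, unlike in real applications, the noise standard deviation is known and the TMSE can be compared to the noise variance.

```python
import numpy as np

# Hypothetical experiment: fit polynomials of increasing degree (a stand-in
# for networks with increasing numbers of hidden neurons) to noisy data and
# watch the TMSE fall below the noise variance, the signature of overfitting.
rng = np.random.default_rng(1)
sigma = 0.2                      # noise standard deviation (known here only
                                 # because we generate the data ourselves)
x = np.linspace(-1.0, 1.0, 30)
y = np.sin(np.pi * x) + sigma * rng.normal(size=x.size)

for degree in (1, 3, 5, 10):
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    tmse = np.mean(residuals**2)
    flag = "  <- below the noise variance" if tmse < sigma**2 else ""
    print(f"degree {degree:2d}: TMSE = {tmse:.4f}{flag}")
```

Because the polynomial families are nested, the TMSE can only decrease as the degree grows; a TMSE below sigma**2 means the model has started fitting the noise rather than the underlying function.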
2.6.4.2 Selection of the Best Architecture: Local Criteria (LOCL
Method)
In the previous section, a global criterion—the virtual leave-one-out score—
was used for finding the model that is least prone to overfitting, among models
having the same complexity. We have also shown that this criterion may not be sufficient for choosing between models of different complexities. In
such a case, it is advantageous to use the local overfitting control via leverages
method (LOCL), based on the values of the leverages [Monari 1999; Monari
et al. 2002].
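Since the leverages also underlie the virtual leave-one-out score itself, it may help to see both quantities computed together. The following is a minimal numpy sketch for a linear-in-parameters model, where the leverages h_kk are the diagonal of the hat matrix Z (Z^T Z)^{-1} Z^T; for a neural network, the Jacobian of the model output with respect to the parameters would play the role of Z (an assumption of this illustration, with synthetic data).

```python
import numpy as np

# Synthetic linear-in-parameters example (purely illustrative).
rng = np.random.default_rng(0)
N = 50
Z = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])   # N x q matrix
y = Z @ np.array([1.0, 2.0, -0.5]) + 0.1 * rng.normal(size=N)

# Least-squares fit and training residuals.
theta, *_ = np.linalg.lstsq(Z, y, rcond=None)
r = y - Z @ theta

# Leverages: diagonal of the hat matrix H = Z (Z^T Z)^{-1} Z^T.
H = Z @ np.linalg.solve(Z.T @ Z, Z.T)
h = np.diag(H)

tmse = np.mean(r**2)
# Virtual leave-one-out score: each residual inflated by 1/(1 - h_kk),
# so no model needs to be refitted with an example withheld.
e_loo = np.mean((r / (1.0 - h))**2)

print(tmse, e_loo)   # the leave-one-out score is always >= the TMSE
```

Since each leverage satisfies 0 < h_kk < 1, every inflated residual is at least as large as the raw one, which is why the virtual leave-one-out score never falls below the TMSE.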