Fig. 1.14. The most parsimonious neural network has the best generalization abilities.
where $N_V$ is the number of observations present in the validation set, and
where, for simplicity, $y_k$ denotes the measurements of the quantity to be
modeled: $y_k = y_p(x_k)$. This relation is valid in the usual case of a model with a
single output; if the model has several outputs, the VMSE is the sum of the
mean square errors on each output.
This quantity should be compared with the mean square error on the
training set (TMSE),
$$
\mathrm{TMSE} = \frac{1}{N_T} \sum_{k=1}^{N_T} \left[ y_k - g(x_k, w) \right]^2,
$$
where $N_T$ is the number of observations present in the training set.
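The TMSE defined above can be computed directly from its formula. The following is a minimal sketch, in which the affine model $g$ and the weights $w$ are illustrative assumptions, not taken from the text:

```python
import numpy as np

def tmse(y, x, g, w):
    """Mean square error on the training set: (1/N_T) * sum_k [y_k - g(x_k, w)]^2."""
    residuals = y - g(x, w)          # y_k - g(x_k, w) for each observation k
    return np.mean(residuals ** 2)   # average of the squared residuals

# Hypothetical affine model g(x, w) = w[0] + w[1] * x on a small training set
g = lambda x, w: w[0] + w[1] * x
x_obs = np.array([0.0, 1.0, 2.0, 3.0])
y_obs = np.array([0.1, 1.9, 4.1, 5.9])
print(tmse(y_obs, x_obs, g, np.array([0.0, 2.0])))
```

The VMSE is computed by the same formula, with the sum running over the $N_V$ observations of the validation set instead.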
Consider the example shown in Fig. 1.14, and assume that the observations
of the validation set are the midpoints between the observations of the
training set. Clearly, the TMSE of the second network is certainly smaller than
the TMSE of the first network, whereas the VMSE of the second network is
certainly larger than that of the first network. Therefore, if model selection
were performed on the basis of the training mean square error, overparame-
terized networks would systematically be favored, thereby leading to models
that exhibit overfitting.
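The same phenomenon can be reproduced with a simple numerical sketch. Here, polynomial fits stand in for the two networks of Fig. 1.14 (the degrees, the target function, and the noise level are assumptions for illustration); the validation abscissas are taken at the midpoints of the training abscissas, as in the argument above:

```python
import numpy as np

rng = np.random.default_rng(0)

# Noisy samples of a smooth function for training
x_train = np.linspace(0.0, 1.0, 10)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0.0, 0.2, x_train.size)

# Validation observations at the midpoints of the training abscissas
x_val = (x_train[:-1] + x_train[1:]) / 2
y_val = np.sin(2 * np.pi * x_val) + rng.normal(0.0, 0.2, x_val.size)

results = {}
for degree in (3, 9):   # parsimonious model vs overparameterized model
    w = np.polyfit(x_train, y_train, degree)
    tmse = np.mean((y_train - np.polyval(w, x_train)) ** 2)
    vmse = np.mean((y_val - np.polyval(w, x_val)) ** 2)
    results[degree] = (tmse, vmse)
    print(f"degree {degree}: TMSE = {tmse:.4f}, VMSE = {vmse:.4f}")
```

The degree-9 polynomial passes through all ten training points, so its TMSE is essentially zero, while its oscillations between the training points inflate its VMSE; selecting on TMSE alone would favor it, which is exactly the overfitting trap described above.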
 