Fig. 2.19. Two models that have a large bias and a small variance
Fig. 2.20. Two models that have a small bias and a large variance
Unfortunately, bias and variance, just as the theoretical cost function,
cannot be computed. Thus, the solution to the difficult problem of model
selection is a tradeoff between two quantities that cannot be computed. The
difficulty of the problem increases as the size of the training set decreases
[Gallinari 1999].
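Although bias and variance cannot be computed for a real problem, they can be estimated numerically in a synthetic setting where the regression function is known. The following Python sketch relies on that assumption. The function true_function, the noise level, the fixed input grid and the two polynomial degrees are hypothetical choices made for illustration only. Many training sets are drawn, one model is trained on each, and the squared bias and the variance of the predictions are averaged over a test grid.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def true_function(x):
    # Hypothetical regression function; it is known here only because
    # the data are synthetic, which is never the case in practice.
    return np.sin(2.0 * np.pi * x)


def bias_variance(degree, n_train=20, n_sets=200, noise_std=0.3, seed=0):
    """Monte Carlo estimate of squared bias and variance of a polynomial model."""
    rng = np.random.default_rng(seed)
    x_train = np.linspace(0.0, 1.0, n_train)   # fixed inputs (assumption)
    x_test = np.linspace(0.0, 1.0, 50)
    predictions = np.empty((n_sets, x_test.size))
    for i in range(n_sets):
        # Each training set differs only by its noise realization.
        y_train = true_function(x_train) + rng.normal(0.0, noise_std, n_train)
        coeffs = P.polyfit(x_train, y_train, degree)
        predictions[i] = P.polyval(x_test, coeffs)
    mean_prediction = predictions.mean(axis=0)
    bias_sq = np.mean((mean_prediction - true_function(x_test)) ** 2)
    variance = np.mean(predictions.var(axis=0))
    return bias_sq, variance


simple_bias, simple_var = bias_variance(degree=1)
complex_bias, complex_var = bias_variance(degree=5)
print(f"degree 1: bias^2 = {simple_bias:.4f}, variance = {simple_var:.4f}")
print(f"degree 5: bias^2 = {complex_bias:.4f}, variance = {complex_var:.4f}")
```

In this toy setting, the straight line should show the larger bias and the degree-5 polynomial the larger variance, which is the qualitative behavior illustrated by Figs. 2.19 and 2.20.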
The models, trained from the same training set, among which a choice is
to be made, differ by two main characteristics:
- their complexity: the complexity of a model can be defined as the number
  of its elements (the number of monomials in a polynomial model, the
  number of hidden neurons in a neural network), hence the number of its
  adjustable parameters;
- the vector of parameters for a given complexity: for models that are
  nonlinear with respect to the parameters, the cost function has several
  local minima; therefore, for a given complexity and a given training set,
  different trainings (with different initial values of the parameters) may
  provide different models corresponding to different minima of the cost
  function. Conversely, for models that are linear with respect to their
  parameters, the least squares cost function has a single minimum: for a
  given complexity and a given training set, there is a single vector of
  parameters for which the cost function is minimum.
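The contrast between the two kinds of models can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the noise-free data, the cubic polynomial used as the linear-in-parameters model, and the one-parameter model sin(w x) used as a toy model that is nonlinear with respect to its parameter. The polynomial is fitted in closed form by least squares, while the nonlinear model is trained by plain gradient descent from two different initial values.

```python
import numpy as np

# Synthetic, noise-free training set (hypothetical data).
x = np.linspace(0.0, 5.0, 200)
y = np.sin(2.0 * x)

# Model linear in its parameters: cubic polynomial.
# The least squares solution is unique when the design matrix
# has full column rank.
X = np.vander(x, N=4, increasing=True)
theta, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
unique_minimum = rank == X.shape[1]
print("linear model: unique least squares solution:", bool(unique_minimum))


# Model nonlinear in its parameter: g(x, w) = sin(w * x).
def cost(w):
    return np.mean((np.sin(w * x) - y) ** 2)


def gradient(w):
    residual = np.sin(w * x) - y
    return np.mean(2.0 * residual * x * np.cos(w * x))


def train(w_init, learning_rate=0.02, n_steps=2000):
    # Plain gradient descent on the least squares cost.
    w = w_init
    for _ in range(n_steps):
        w -= learning_rate * gradient(w)
    return w


# Same complexity, same training set, different initial values.
results = {w0: train(w0) for w0 in (2.3, 3.5)}
for w0, w in results.items():
    print(f"w_init = {w0:.1f} -> w = {w:.3f}, cost = {cost(w):.4f}")
```

With these choices, the training started at 2.3 should reach the parameter value that generated the data, whereas the training started at 3.5 should stop in a different minimum with a markedly higher cost.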
Hence, for a model that is not linear with respect to its parameters, the model
selection problem is actually twofold: