For a model that is not linear with respect to its parameters (e.g., a feed-
forward neural network, or an RBF network with adjustable centers and
widths), the optimization problem is multivariable and nonlinear, which makes
ordinary least squares inapplicable. The techniques that solve such problems
are described in detail in Chap. 2; they are iterative techniques that generate
successive estimates of the parameters until a minimum of the cost function
is reached, or a stopping criterion is satisfied.
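As a concrete illustration, the following is a minimal sketch (ours, not the book's) of such an iterative procedure for a toy model y = w2 tanh(w1 x), which is nonlinear in its parameters; the data, the step size, the stopping threshold, and the update rule (a simple gradient step, of the kind discussed next) are all assumptions made for the example:

```python
import numpy as np

# Toy model that is nonlinear in its parameters: y = w2 * tanh(w1 * x).
# Hypothetical data: a noisy underlying process to be modeled.
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=50)
y = 0.8 * np.tanh(2.0 * x) + 0.05 * rng.standard_normal(50)

w1, w2 = 0.1, 0.1        # initial parameter estimates ("small" values)
lr, tol = 0.5, 1e-6      # step size and stopping threshold (assumed)

for iteration in range(1000):
    hidden = np.tanh(w1 * x)
    residual = w2 * hidden - y
    # Gradient of the least-squares cost with respect to each parameter
    g2 = np.mean(residual * hidden)
    g1 = np.mean(residual * w2 * (1.0 - hidden ** 2) * x)
    if g1 ** 2 + g2 ** 2 < tol ** 2:   # stopping criterion satisfied
        break
    w1 -= lr * g1                      # next estimate of the parameters
    w2 -= lr * g2
```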
For such nonlinear problems, the optimization techniques of choice are
gradient methods: at each iteration, they compute the gradient of the cost
function with respect to the parameters of the model, then use that gradient
to update the parameter values found at the previous iteration. Backpropa-
gation (described in Chap. 2) is a popular, computationally economical way
of computing the gradient of the cost function. Backpropagation is therefore
not a training algorithm: it is simply a technique for computing the gradient
of the cost function, which is very frequently an ingredient of neural network
training. It has often been stated that the invention of backpropagation made
feedforward neural network training possible; that is definitely not correct:
methods for computing the gradient of a cost function were used in signal
processing long before the introduction of neural networks, and such methods
can be used for feedforward neural network training [Marcos 1992].
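To make the distinction concrete, here is a short sketch (an illustration of ours, with hypothetical shapes and values) that uses backpropagation only to compute the gradient of a squared-error cost for a one-hidden-layer network, then checks one component against a finite-difference estimate; any update rule may then consume the resulting gradient:

```python
import numpy as np

# One-hidden-layer network: y = v . tanh(W x), for a single example.
rng = np.random.default_rng(1)
x = rng.standard_normal(3)          # one input example (hypothetical)
t = 0.7                             # its desired output (hypothetical)
W = 0.1 * rng.standard_normal((4, 3))
v = 0.1 * rng.standard_normal(4)

# Forward pass
h = np.tanh(W @ x)
y = v @ h
e = y - t                           # modeling error on this example

# Backward pass: backpropagation yields the gradient of the cost
# J = 0.5 * e**2 with respect to the parameters, and nothing more.
grad_v = e * h
grad_W = np.outer(e * v * (1.0 - h ** 2), x)

# Sanity check of one component against a finite-difference estimate.
eps = 1e-6
W_pert = W.copy()
W_pert[0, 0] += eps
J = 0.5 * e ** 2
J_pert = 0.5 * (v @ np.tanh(W_pert @ x) - t) ** 2
assert abs((J_pert - J) / eps - grad_W[0, 0]) < 1e-4
```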
Training algorithms have improved tremendously during the past few
years. At the beginning of the 1990s, publications frequently mentioned
tens or hundreds of thousands of iterations, requiring days of computation on
powerful computers. At present, a typical training requires tens or hundreds
of iterations. Figure 1.15 displays the training of a model with a single variable.
Crosses are the elements of the training set. The parameters are initialized to
“small” values (see Chap. 2 for a description of the initialization procedure),
so that the output of the network is essentially zero. The result obtained after
13 iterations is “visually” satisfactory; quantitatively, the TMSE and the VMSE
(the points of the validation set are not shown) are of the same order of
magnitude, which is on the order of the variance of the noise, so
that the model is appropriate.
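In code, that final check might look like the following sketch; the data, the noise level, and the “trained model” (replaced here by the true function, standing in for a well-trained network) are all hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)
sigma = 0.1                               # noise standard deviation (assumed)

# Hypothetical example: the true process is sin(x); we pretend the trained
# model predicts it well, so the residuals are essentially the noise.
x_train = rng.uniform(-3.0, 3.0, 100)
x_valid = rng.uniform(-3.0, 3.0, 50)
y_train = np.sin(x_train) + sigma * rng.standard_normal(100)
y_valid = np.sin(x_valid) + sigma * rng.standard_normal(50)
pred_train = np.sin(x_train)              # stand-in for the model's output
pred_valid = np.sin(x_valid)

tmse = np.mean((pred_train - y_train) ** 2)
vmse = np.mean((pred_valid - y_valid) ** 2)

# The model is deemed appropriate when TMSE and VMSE are of the same
# order of magnitude, close to the noise variance sigma**2.
print(f"TMSE = {tmse:.4f}, VMSE = {vmse:.4f}, sigma^2 = {sigma**2:.4f}")
```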
1.2.2.5 Conclusion
In this section, we have explained how and why neural networks with su-
pervised training should be used. To summarize, neural networks are useful
whenever a nonlinear relation between numerical data is sought; they are
therefore statistical tools for nonlinear regression. An overview of the tasks
involved in nonlinear model design was presented, together with conditions
for successful application. In Chap. 2, the reader will find all the necessary
details for neural network training, input selection, and model selection,
both for static models (feedforward neural networks) and for dynamic models
(recurrent neural networks).