was to adapt the learning algorithm for perceptrons to the multilayered network? This problem remained a major hindrance to using the MLP network until the discovery of the backpropagation algorithm. Before this algorithm, several other approaches were attempted, such as using stepwise functions (Reed and Marks, 1999), but these were not successful.
Backpropagation is composed of two main steps: first, the derivatives of the network training error with respect to the weights are computed through a clever application of the chain rule for derivatives. Second, a gradient descent method is applied to adjust the weights, using the error derivatives to minimize the output errors. Backpropagation can be applied to any feedforward neural network with differentiable activation functions.
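The weight adjustment performed in this second step is the usual gradient descent update; as a sketch, with E denoting the training error defined below and \eta a learning rate (the symbol \eta is introduced here only for illustration and does not appear in this excerpt):

w_{ij} \leftarrow w_{ij} - \eta \, \frac{\partial E}{\partial w_{ij}}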
Briefly, let w_ij denote the weight to node i from node j. Since this is a feedforward network, no backward connections are permitted and w_ji has no meaning in the forward direction, that is, w_ji = 0. If there is no connection from node j, then w_ij = 0. The network can be biased by adding a bias node with constant activation, for example, y_bias = 1. In the first step of the algorithm, the output node values are computed by presenting a training pattern x_k to the inputs and evaluating each node, beginning with the first hidden layer and continuing to the final output node. This procedure can be performed by a computer program that meticulously computes each node output in order.
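As one purely illustrative way to organize this computation, the sketch below assumes a fully connected layered network whose weights are stored as a list of NumPy matrices, each with an extra column acting on the constant bias activation y_bias = 1, and a logistic activation at every node; the names forward and weights and the choice of activation are assumptions, not taken from the text.

import numpy as np

def sigmoid(u):
    # A differentiable activation function, as backpropagation requires.
    return 1.0 / (1.0 + np.exp(-u))

def forward(weights, x_k):
    # Present training pattern x_k to the inputs and evaluate each layer in
    # order, from the first hidden layer through to the output nodes.
    y = np.asarray(x_k, dtype=float)
    for W in weights:
        y = np.append(y, 1.0)   # constant bias node activation, y_bias = 1
        y = sigmoid(W @ y)      # outputs of the next layer of nodes
    return y                    # computed network outputs c_k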
When the outputs at each node have been obtained, the errors are computed; here the sum of squared errors is used:
E^{s}_{MLP} = \frac{1}{2} \sum_{k} \sum_{i} \left( y_{ki} - c_{ki} \right)^{2}        (3.7)
where k indexes the training patterns in the training set, i indexes the output nodes, y_ki is the desired output, and c_ki is the computed network output. The scalar 1/2 is included for convenience when computing the derivative of this error. The error function E^s_MLP is the sum of the individual squared errors of each training pattern presented to the network. This can be seen by writing
E^{s}_{MLP} = \sum_{k} E_{k}, \qquad E_{k} = \frac{1}{2} \sum_{i} \left( y_{ki} - c_{ki} \right)^{2}
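Continuing the sketch above (and reusing its numpy import and forward function), the per-pattern error E_k and the total error E^s_MLP of Equation 3.7 could be accumulated as follows; the names patterns and targets, holding the training inputs x_k and desired outputs y_k, are assumptions made for illustration.

def pattern_error(y_k, c_k):
    # E_k = 1/2 * sum_i (y_ki - c_ki)^2 for a single training pattern
    return 0.5 * np.sum((np.asarray(y_k, dtype=float) - np.asarray(c_k, dtype=float)) ** 2)

def total_error(weights, patterns, targets):
    # E^s_MLP is the sum of the individual pattern errors E_k
    return sum(pattern_error(y_k, forward(weights, x_k))
               for x_k, y_k in zip(patterns, targets))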
Note that these quantities are independent of one another, since each depends only on the training pattern being presented. When the errors have been obtained, the next step is to compute the derivative of the error with respect to the weights.
The derivative of E^s_MLP is

\frac{\partial E^{s}_{MLP}}{\partial w_{ij}} = \sum_{k} \frac{\partial E_{k}}{\partial w_{ij}}        (3.8)
Using the chain rule, the individual pattern derivatives can be written as

\frac{\partial E_{k}}{\partial w_{ij}} = \sum_{p} \frac{\partial E_{k}}{\partial u_{p}} \, \frac{\partial u_{p}}{\partial w_{ij}}        (3.9)
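The evaluation of these chain-rule terms continues beyond this excerpt. Purely as an illustration of how Equations 3.8 and 3.9 are typically turned into code for the sketch network above, and assuming that u_p denotes the weighted sum feeding node p (a standard convention, not defined in this excerpt), the per-pattern derivatives might be computed as follows, with delta playing the role of \partial E_k / \partial u_p; this reuses numpy and sigmoid from the earlier sketch.

def backprop_gradients(weights, x_k, y_k):
    # Forward pass, caching each layer's activations with the bias appended.
    acts = [np.append(np.asarray(x_k, dtype=float), 1.0)]
    for W in weights:
        acts.append(np.append(sigmoid(W @ acts[-1]), 1.0))
    c_k = acts[-1][:-1]                          # computed outputs
    # delta_p = dE_k/du_p at the output layer, for the squared error of
    # Eq. 3.7 with the logistic activation: (c - y) * c * (1 - c)
    delta = (c_k - np.asarray(y_k, dtype=float)) * c_k * (1.0 - c_k)
    grads = [None] * len(weights)
    for l in range(len(weights) - 1, -1, -1):
        # dE_k/dw_ij = delta_i * (activation feeding node i), as in Eq. 3.9
        grads[l] = np.outer(delta, acts[l])
        if l > 0:
            back = (weights[l].T @ delta)[:-1]   # drop the bias component
            h = acts[l][:-1]
            delta = back * h * (1.0 - h)         # deltas for the layer below
    return grads

Summing these per-pattern gradients over k gives the total derivative of Equation 3.8, which could then be used in the gradient descent update sketched earlier.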