intensive use of multilayer perceptron networks in many simulated engineering
applications. Real-life applications, however, had to be "postponed" at that time
due to the lack of a suitable neuro-technology. In the 1990s Rumelhart put much
effort into popularizing the training algorithm among the neural network scientific
community. Presently, the backpropagation algorithm is also used (in slightly
modified form) for training other categories of neural networks.
In the following, we will confine our discussion mainly to multilayer
perceptron networks. As mentioned earlier, this kind of network, based on given
training samples or input-output patterns, implements a nonlinear mapping that is
applicable to function approximation, pattern classification, signal analysis, etc.
In the process of training, the network learns through adaptation of its synaptic
weights in such a way that the discrepancy between the given pattern and the
corresponding actual pattern at the network output is minimized. Because the
synaptic adaptation mostly follows the gradient descent law of parameter tuning,
the backpropagation training algorithm can be viewed as a search algorithm for
unconstrained minimization of a suitably constructed error function at the network
output.
In order to illustrate the basic concept of the backpropagation algorithm, let us
consider its application to the training of a single neuron located in the output layer
of a multilayer perceptron (see Figure 3.12). In addition, let us suppose that the
hyperbolic tangent function

$$ y_j = f(u_j) = \tanh(\gamma u_j) = \frac{1 - \exp(-2\gamma u_j)}{1 + \exp(-2\gamma u_j)} \qquad (3.1) $$

is chosen as the nonlinear activation function, where

$$ u_j = \sum_{i=1}^{n} w_i x_i + \theta_j, \qquad \gamma > 0. \qquad (3.2) $$
Furthermore, $x_i$ is the $i$-th input with corresponding interconnecting weight $w_i$ to the
neuron, and $\theta_j$ is the bias input to the same neuron. Typically, all neurons in a
particular layer of the multilayer perceptron have the same activation function.
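As a minimal illustration, the forward computation of Eqs. (3.1) and (3.2) for a single neuron can be sketched in Python as follows; the function name, the array-based inputs, and the default slope gamma = 1.0 are illustrative assumptions, not taken from the text:

import numpy as np

def neuron_output(x, w, theta, gamma=1.0):
    """Single-neuron forward pass following Eqs. (3.1)-(3.2).

    u = sum_i w_i * x_i + theta  -- weighted input plus bias, Eq. (3.2)
    y = tanh(gamma * u)          -- hyperbolic tangent activation, Eq. (3.1)
    """
    u = np.dot(w, x) + theta
    return np.tanh(gamma * u)

# Example with three inputs and arbitrary weights and bias
x = np.array([0.5, -1.0, 0.25])
w = np.array([0.1, 0.4, -0.2])
print(neuron_output(x, w, theta=0.05))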
The aim of the learning algorithm is to minimize the instantaneous squared error
function of the network output

$$ S_j = 0.5\,(d_j - y_j)^2 = 0.5\, e_j^2, \qquad (3.3) $$
defined as the square of the difference $(d_j - y_j)$ between the desired output signal
$d_j$ and the actual output signal $y_j$ of the network, by modifying the synaptic weights
$w_i$. The minimization process in parameter tuning steps $\Delta w_i$ is based on the steepest
descent gradient rule.
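To make the gradient-based tuning concrete, the sketch below performs one steepest-descent update of the weights of the single output neuron; the learning rate eta, the bias update, and the explicit form of the gradient (obtained by applying the chain rule through the tanh activation) are assumptions added here for illustration:

import numpy as np

def train_step(x, d, w, theta, eta=0.1, gamma=1.0):
    """One steepest-descent step minimizing S_j = 0.5 * (d_j - y_j)**2, Eq. (3.3)."""
    u = np.dot(w, x) + theta            # Eq. (3.2): weighted input plus bias
    y = np.tanh(gamma * u)              # Eq. (3.1): tanh activation
    e = d - y                           # output error e_j = d_j - y_j
    # Chain rule through the activation: dS/dw_i = -e * gamma * (1 - y**2) * x_i
    grad_w = -e * gamma * (1.0 - y ** 2) * x
    grad_theta = -e * gamma * (1.0 - y ** 2)
    # Move against the gradient (steepest descent)
    w = w - eta * grad_w
    theta = theta - eta * grad_theta
    return w, theta, 0.5 * e ** 2

Iterating such a step over the training samples drives the output error toward a (local) minimum.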