and Rumelhart, 1988). It is a supervised learning algorithm, since both
input and output values are presented to the network. The BP algorithm
is also referred to as the generalized delta rule (Murtoniemi et al., 1994).
As previously stated, in the beginning of the training process, weights are
assigned initial, random small values. Then the network is trained
through an iterative process, where each iteration consists of two steps,
feed-forward and back-propagation. In the feed-forward step, the
training data are presented to the ANN model and network inputs are
used for prediction of outputs. In the BP step, the output error is
calculated first, by comparing actual to predicted values. Based
on the obtained values of errors, weights are adjusted in order to make
predictions more accurate. Using the BP algorithm, the new weights of
the network output layer are calculated in the following way:
w^o_pq(n+1) = w^o_pq(n) + η δ^o_q y_p + µ [w^o_pq(n) − w^o_pq(n−1)]
[5.4]
whereas the new hidden layer weights correspond to:
w^h_pq(n+1) = w^h_pq(n) + η δ^h_q y_p + µ [w^h_pq(n) − w^h_pq(n−1)]
[5.5]
where n is the number of the iteration epoch, µ represents the momentum
factor (coefficient), and η is the learning rate. The learning rate η is an
adjustable parameter that controls the speed of the learning process (Sun
et al., 2003). When the learning rate is too high, then the weight values
alter dramatically in each iteration epoch and the output error varies
randomly without converging to an end point. However, if the learning
rate η is too low, then the overall training time can increase significantly
(Freeman and Skapura, 1991), and the ANN model may get caught in
a local error minimum instead of the global minimum (Sun et al., 2003).
It is sometimes recommended to start the training process at a higher
speed and then gradually reduce the learning rate. The momentum coefficient
(µ) is used to avoid local minima and to reduce oscillation of the weight
change. The momentum coefficient determines the proportion of the last
weight change that is added into the new weight change (Sun et al.,
2003).
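To make the update rule concrete, the sketch below applies Eqs 5.4 and 5.5 to one layer of weights. It is a minimal illustration: the function name, the array layout, and the default values of η and µ are assumptions for demonstration, not taken from the text.

import numpy as np

def update_weights(w, w_prev, delta, y, eta=0.1, mu=0.9):
    """One iteration of the update rule in Eqs 5.4/5.5.

    w      -- current weights w(n); w[p, q] links neuron p to neuron q
    w_prev -- weights w(n-1) from the previous epoch (momentum term)
    delta  -- error terms δ_q of the layer being updated
    y      -- outputs y_p of the preceding layer
    eta    -- learning rate η (illustrative value)
    mu     -- momentum coefficient µ (illustrative value)
    """
    # η δ_q y_p for every (p, q) pair, computed as an outer product
    gradient_step = eta * np.outer(y, delta)
    # µ [w(n) - w(n-1)]: a fraction of the last weight change
    momentum_step = mu * (w - w_prev)
    return w + gradient_step + momentum_step

The same function serves both the output-layer update of Eq. 5.4 (with δ^o_q) and the hidden-layer update of Eq. 5.5 (with δ^h_q); only the error terms passed in differ.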
The term δ^o_q in Eq. 5.4 is the error term for the output layer neurons,
calculated according to the following expression:
δ^o_q = f′(x^o_q)(t^o_q − y^o_q)
[5.6]
where t^o_q is the training output value and y^o_q is the output value
predicted by the network.
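As a worked illustration of Eq. 5.6, the sketch below evaluates the output-layer error term assuming a sigmoid transfer function f, whose derivative is f(x)(1 − f(x)). The text leaves f unspecified, so this choice and the function names are assumptions.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def output_delta(x_out, t, y):
    """Error term of Eq. 5.6: δ^o_q = f′(x^o_q)(t^o_q − y^o_q).

    x_out -- net inputs x^o_q of the output neurons
    t     -- target (training) output values t^o_q
    y     -- predicted output values y^o_q
    """
    # assumed sigmoid activation, so f′(x) = f(x)(1 − f(x))
    f_prime = sigmoid(x_out) * (1.0 - sigmoid(x_out))
    return f_prime * (t - y)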
In Eq. 5.5, the error term for the hidden layer neurons
δ^h_q is calculated as follows (Murtoniemi et al., 1994):