and Rumelhart, 1988). It is a supervised learning algorithm, since both
input and output values are presented to the network. The BP algorithm
is also referred to as the generalized delta rule (Murtoniemi et al., 1994).
As previously stated, in the beginning of the training process, weights are
assigned initial, random small values. Then the network is trained
through an iterative process, where each iteration consists of two steps,
feed-forward and back-propagation. In the feed-forward step, the
training data are presented to the ANN model and network inputs are
used for prediction of outputs. In the BP step, the output error is
calculated first, by comparing actual to predicted values. Based
on the obtained values of errors, weights are adjusted in order to make
predictions more accurate. Using the BP algorithm, the new weights of
the network output layer are calculated in the following way:
w^o_pq(n+1) = w^o_pq(n) + η δ^o_q y_p + µ [w^o_pq(n) − w^o_pq(n−1)]
[5.4]
whereas the new hidden layer weights correspond to:
w^h_pq(n+1) = w^h_pq(n) + η δ^h_q y_p + µ [w^h_pq(n) − w^h_pq(n−1)]
[5.5]
where n is the number of the iteration epoch, µ represents the momentum
factor (coefficient), and η is the learning rate. The learning rate η is an
adjustable parameter that controls the speed of the learning process (Sun
et al., 2003). When the learning rate is too high, then the weight values
alter dramatically in each iteration epoch and the output error varies
randomly without converging to an end point. However, if the learning
rate η is too low, then the overall training time can increase significantly
(Freeman and Skapura, 1991), and the ANN model may get caught in
a local error minimum instead of the global minimum (Sun et al., 2003).
It is sometimes recommended to start the training process at a higher
speed and then gradually reduce the learning rate. The momentum coefficient
(µ) is used to avoid local minima and to reduce oscillation of the weight
change. The momentum coefficient determines the proportion of the last
weight change that is added into the new weight change (Sun et al.,
2003).
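To make the update rule concrete, the sketch below applies Eqs 5.4 and 5.5 to one layer of weights. It is a minimal illustration: the function name, the array layout, and the default values of η and µ are assumptions for demonstration, not taken from the text.

import numpy as np

def update_weights(w, w_prev, delta, y, eta=0.1, mu=0.9):
    """One iteration of the update rule in Eqs 5.4/5.5.

    w      -- current weights w(n); w[p, q] links neuron p to neuron q
    w_prev -- weights w(n-1) from the previous epoch (momentum term)
    delta  -- error terms δ_q of the layer being updated
    y      -- outputs y_p of the preceding layer
    eta    -- learning rate η (illustrative value)
    mu     -- momentum coefficient µ (illustrative value)
    """
    # η δ_q y_p for every (p, q) pair, computed as an outer product
    gradient_step = eta * np.outer(y, delta)
    # µ [w(n) - w(n-1)]: a fraction of the last weight change
    momentum_step = mu * (w - w_prev)
    return w + gradient_step + momentum_step

The same function serves both the output-layer update of Eq. 5.4 (with δ^o_q) and the hidden-layer update of Eq. 5.5 (with δ^h_q); only the error terms passed in differ.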
The term δ^o_q in Eq. 5.4 is the error term for the output layer neurons,
calculated according to the following expression:
δ^o_q = f′(x^o_q)(t^o_q − y^o_q)
[5.6]
where t^o_q is the training output value and y^o_q is the output value
predicted by the network.
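As a worked illustration of Eq. 5.6, the sketch below evaluates the output-layer error term assuming a sigmoid transfer function f, whose derivative is f(x)(1 − f(x)). The text leaves f unspecified, so this choice and the function names are assumptions.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def output_delta(x_out, t, y):
    """Error term of Eq. 5.6: δ^o_q = f′(x^o_q)(t^o_q − y^o_q).

    x_out -- net inputs x^o_q of the output neurons
    t     -- target (training) output values t^o_q
    y     -- predicted output values y^o_q
    """
    # assumed sigmoid activation, so f′(x) = f(x)(1 − f(x))
    f_prime = sigmoid(x_out) * (1.0 - sigmoid(x_out))
    return f_prime * (t - y)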
In Eq. 5.5, the error term for the hidden layer neurons
δ^h_q is calculated as follows (Murtoniemi et al., 1994):