weight values is found, the network training stops; at this point the squared error between the actual (training) output values and those predicted by the network is minimal. At the beginning of the training process, the data are divided into two subsets, training and test. The training data are used to search for optimal weight values, whereas the test data are used to check the network's predictive ability externally. The sum of squared errors (SSE) for the training and test subsets can be calculated using the following equation (Basheer and Hajmeer, 2000):
$$\mathrm{SSE} = \sum_{p=1}^{N} \sum_{i=1}^{M} \left( o_{pi} - t_{pi} \right)^{2} \qquad [5.3]$$
where $o_{pi}$ is the output predicted by the network at the $i$th output node for the $p$th sample; $t_{pi}$ is the training (actual) output of the $i$th output node for the $p$th sample; $N$ is the number of training samples; and $M$ is the number of output nodes.
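As a minimal sketch, Eq. [5.3] can be evaluated directly from arrays of predicted and actual outputs. The NumPy implementation and the example values below are illustrative assumptions, not taken from the cited sources:

```python
import numpy as np

def sse(outputs, targets):
    """Sum of squared errors per Eq. [5.3].
    outputs: network predictions o_pi, shape (N, M)
    targets: actual (training) values t_pi, shape (N, M)
    where N is the number of samples and M the number of output nodes."""
    return float(np.sum((outputs - targets) ** 2))

# Hypothetical example: N = 4 samples, M = 2 output nodes.
o = np.array([[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.1, 0.9]])
t = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
print(sse(o, t))  # 0.3
```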
During network training, the change in error for the training and test data sets is monitored simultaneously. For the training data set, SSE decreases indefinitely with an increasing number of hidden nodes or training iterations (epochs) (Sun et al., 2003). Initially, SSE for the training set decreases rapidly due to learning, whereas its subsequent slower decrease is attributed to memorization or overfitting (if the number of training cycles or hidden nodes is too large). For the test data set, SSE decreases initially but subsequently increases due to memorization and overfitting of the ANN model (Sun et al., 2003). It is therefore recommended to stop training once the test error starts to increase, and to select the number of hidden nodes at which the test error is at a minimum (Sun et al., 2003).
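This stopping rule can be sketched as a training loop that tracks the test-set SSE and remembers the epoch at which it was smallest. The callables update_weights, train_sse and test_sse, and the 10-epoch patience threshold, are hypothetical placeholders rather than part of any cited method:

```python
import numpy as np

def train_with_early_stopping(update_weights, train_sse, test_sse,
                              max_epochs=1000, patience=10):
    """Stop once the test SSE has not improved for `patience` epochs;
    report the epoch at which it was minimal (hypothetical sketch)."""
    best_test, best_epoch = np.inf, 0
    history = []
    for epoch in range(max_epochs):
        update_weights()                      # one pass over the training data
        e_train, e_test = train_sse(), test_sse()
        history.append((e_train, e_test))
        if e_test < best_test:                # test error still falling
            best_test, best_epoch = e_test, epoch
        elif epoch - best_epoch >= patience:  # test error has kept rising
            break                             # stop: memorization/overfitting
    return best_epoch, best_test, history
```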
Training data can be presented to the network model example by example (incremental training) or as a whole batch (batch training) (Wythoff, 1993). The weights are updated after processing each sample in incremental training, or after processing the entire training set in batch training (Sun et al., 2003).
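The difference between the two update schemes can be sketched as follows. The simple linear model o = X @ w is an illustrative assumption (not the network described here), used only so that the SSE gradient is concrete:

```python
import numpy as np

def grad_sse(w, X, T):
    """Gradient of the SSE of Eq. [5.3] for a linear model o = X @ w
    (an assumed stand-in for the trained network)."""
    return 2.0 * X.T @ (X @ w - T)

def incremental_epoch(w, X, T, lr=0.01):
    # Incremental training: weights updated after each sample.
    for x, t in zip(X, T):
        w = w - lr * grad_sse(w, x[None, :], t[None, :])
    return w

def batch_epoch(w, X, T, lr=0.01):
    # Batch training: one update from the entire training set per epoch.
    return w - lr * grad_sse(w, X, T)
```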
The training process can be set up to last until a predetermined error value is reached, or for a certain period of time. It is important to note that in the case of neural networks, the error is minimized with respect to the weights of the network rather than with respect to the parameters, as is the case for least-squares methods in conventional statistical approaches (Reis et al., 2004).
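A stopping check combining the two criteria just mentioned might look like the sketch below; the target error, time budget, and epoch cap are all hypothetical values:

```python
import time

def should_stop(current_sse, epoch, start_time,
                target_sse=1e-3, max_seconds=60.0, max_epochs=10_000):
    """Stop when the SSE falls below a predetermined value, or when a
    set time (or epoch) budget runs out (hypothetical thresholds)."""
    if current_sse <= target_sse:                     # error criterion met
        return True
    if time.monotonic() - start_time >= max_seconds:  # time budget spent
        return True
    return epoch >= max_epochs                        # epoch budget spent
```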
One of the most commonly used training (i.e. weight adjustment) methods is the back-propagation (BP) algorithm (McClelland