Figure 11. Example Levenberg-Marquardt Neural Network training details (Foresee & Hagan, 1997).
Not only does this algorithm help eliminate overfitting of the target function, it also provides an estimate of how many weights and biases are being effectively used by the network.
Larger networks should result in approximately the same performance, since regularization produces a trade-off between error and network parameters that is relatively independent of network size.
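In the standard Bayesian-regularization formulation underlying this approach (Foresee & Hagan, 1997), the regularized objective and the effective number of parameters can be sketched as follows; the symbols used here (E_D, E_W, alpha, beta, gamma, H, N) are introduced only for illustration and are not defined elsewhere in this excerpt:

F = \beta E_D + \alpha E_W

\gamma = N - 2\alpha \, \mathrm{tr}(H^{-1}), \qquad H = \nabla^2 F

Here E_D is the sum of squared errors, E_W is the sum of squared weights and biases, N is the total number of network parameters, and \gamma (between 0 and N) is the effective number of parameters reported during training.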
All neural network modeling and training are performed in MATLAB 7.0 with MATLAB's Neural Network Toolbox (MathWorks, 2005d). An example of a Levenberg-Marquardt with Automated Bayesian Regularization training session is presented in Figure 11, where we can see that the algorithm is attempting to converge the network to a point of best generalization based on the current training set.
Even though this particular network has 256 weights, the algorithm is limiting the power of the neural network to an effective number of parameters of about 44. The network could further reduce the error on the training set (sum of squared error, SSE), since it could use all 256 weights. However, it has determined that using more than about 44 weights would cause overfitting of the data and thus reduce generalization performance.
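As a rough illustration, a training session like the one in Figure 11 could be set up in the MATLAB 7.0-era Neural Network Toolbox along the following lines. This is a minimal sketch: the variables p and t, the 10-neuron hidden layer, and the epoch settings are illustrative assumptions rather than details taken from the original study.

% Illustrative inputs p (rows = features, columns = samples) and targets t.
p = rand(3, 100);                       % 3 assumed input features, 100 samples
t = sum(p) + 0.1*randn(1, 100);         % assumed noisy scalar target

% Feed-forward network: one tansig hidden layer, linear output layer,
% trained with Levenberg-Marquardt plus Bayesian regularization ('trainbr').
net = newff(minmax(p), [10 1], {'tansig', 'purelin'}, 'trainbr');
net.trainParam.epochs = 300;            % maximum number of training epochs
net.trainParam.show   = 25;             % progress display interval

% Training reports the SSE, the sum of squared weights (SSW), and the
% effective number of parameters at each display interval.
[net, tr] = train(net, p, t);

% Simulate the trained network on the training inputs.
y = sim(net, p);

Note that no separate validation set is split off here, which is the point made below: with Bayesian regularization the full (small) dataset can be used for training.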
Compared to early stopping based on a cross-validation set, the Levenberg-Marquardt with Automated Bayesian Regularization training algorithm is superior, especially for small datasets, since separating out a cross-validation set is not required.
Recurrent Neural Networks Details
The recurrent neural network architecture is the same as the above-described feed-forward architecture, except for one essential difference: there are recurrent connections within the hidden layer, as presented in the subset architecture in Figure 12. This architecture is known as an Elman network.
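A corresponding Elman-style recurrent network could be constructed with the same toolbox, for example along these lines. Again this is only a sketch; newelm, con2seq, and the 'traingdx' training function are standard calls in that toolbox version, but the layer sizes and data are illustrative assumptions.

% Elman recurrent network: the hidden layer feeds back on itself through a
% context layer, giving the network short-term memory over the input sequence.
p = rand(3, 100);                       % assumed input sequence (3 features)
t = sum(p) + 0.1*randn(1, 100);         % assumed target sequence

Pseq = con2seq(p);                      % convert to time sequences (cell arrays)
Tseq = con2seq(t);

net = newelm(minmax(p), [10 1], {'tansig', 'purelin'}, 'traingdx');
net.trainParam.epochs = 500;            % maximum number of training epochs
[net, tr] = train(net, Pseq, Tseq);     % sequential training exercises the recurrence

Yseq = sim(net, Pseq);                  % simulate on the same sequence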