of network inputs. The geometric pyramid rule, on the other hand, suggests assigning

N_h = α √(N_i · N_o)

hidden neurons to a single hidden layer, where N_i is the number of network inputs, N_o the number of its outputs, and α is a multiplication factor whose value, depending on the complexity of the problem to be solved, should be selected in the range 0.5 < α < 2. Baum and Haussler (1989) suggested the number
of neurons in the hidden layer be determined as
NE
u
tr
tol
N
d
,
h
NN
dp
o
where N_tr is the number of training examples, E_tol is the error tolerance, N_dp is the number of data points per training example, and N_o is the number of output neurons.
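Both rules of thumb can be sketched in Python; the function names and the sample values below are illustrative choices, not taken from the text:

```python
import math

def pyramid_rule(n_inputs, n_outputs, alpha=1.0):
    """Geometric pyramid rule: N_h = alpha * sqrt(N_i * N_o),
    with 0.5 < alpha < 2 chosen according to problem complexity."""
    return round(alpha * math.sqrt(n_inputs * n_outputs))

def baum_haussler(n_train, err_tol, n_datapoints, n_outputs):
    """Baum-Haussler estimate: N_h = (N_tr * E_tol) / (N_dp + N_o)."""
    return round((n_train * err_tol) / (n_datapoints + n_outputs))

# Illustrative numbers only:
print(pyramid_rule(64, 4, alpha=1.5))    # -> 24
print(baum_haussler(1000, 0.1, 16, 4))   # -> 5
```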
In any case, the determination of the optimal number of hidden neurons involves trial-and-error experimentation: one starts with some number of neurons in the layer and then, based on the final accuracy of each learning process, increases or decreases the number of hidden neurons and starts a new learning process. In this way redundant hidden neurons can be deleted and the neurons needed for optimal performance of the layer added. It is possible to start with either a relatively large or a relatively small number of neurons, but starting with a large number bears the risk of long computation times and of getting trapped in local minima.
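The trial-and-error procedure can be sketched as a simple search loop; the `train_and_evaluate` callback, the starting size, and the stopping rule are all assumptions made for illustration:

```python
def select_hidden_neurons(train_and_evaluate, start=2, max_neurons=50, patience=3):
    """Trial-and-error search: grow the hidden layer one neuron at a
    time and keep the size with the best validation accuracy.
    `train_and_evaluate(n_hidden)` is a user-supplied function that
    trains a fresh network and returns its validation accuracy."""
    best_n, best_acc, no_improve = start, float("-inf"), 0
    for n in range(start, max_neurons + 1):
        acc = train_and_evaluate(n)        # a new learning process per size
        if acc > best_acc:
            best_n, best_acc, no_improve = n, acc, 0
        else:
            no_improve += 1
            if no_improve >= patience:     # stop when adding neurons no longer helps
                break
    return best_n, best_acc
```

The same loop run downward from a large starting size would implement the neuron-deletion direction instead.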
Khorasani and Weng (1994) have presented an approach to structural adaptation of feedforward neural networks by addition and deletion of hidden neurons, based on the activity status of individual neurons during learning, measured by the variance of the neuron output signal and by the strength of the backpropagated error. This is a proper indication of neuron activity that helps decide which low-activity, redundant neurons are to be deleted.
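A minimal sketch of the output-variance part of such activity-based pruning, assuming the hidden-layer outputs over the training set are available as a matrix (the threshold value is an arbitrary choice, not from the cited work):

```python
import numpy as np

def prune_low_activity(hidden_outputs, var_threshold=1e-3):
    """Flag low-activity hidden neurons in the spirit of
    Khorasani and Weng: `hidden_outputs` has shape
    (n_samples, n_hidden); neurons whose output variance over the
    training set falls below the threshold are marked redundant.
    Returns (indices to keep, indices to prune)."""
    variances = hidden_outputs.var(axis=0)
    keep = variances >= var_threshold
    return np.flatnonzero(keep), np.flatnonzero(~keep)
```

A full implementation would combine this with the backpropagated-error criterion the text mentions.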
There is also a reliable way to determine the number of hidden neurons using Akaike's information criterion (AIC), originally defined as

AIC = −2 ln(maximum likelihood) + 2 (number of adjusted parameters).

The criterion statistically evaluates the goodness of a model by combining the evaluated mean squared error for the training data and the number of parameters to be estimated. Viewed differently, AIC combines a measure of fit with a penalty term that accounts for model complexity. Its suitability for neural network model building was recognized by Kurita (1990) and Fogel (1991), who reformulated the original form of the criterion (for statistically independent, normally distributed output errors with zero mean and constant variance) as