Murata et al. (1994) used this generalization to determine the number of hidden units required to mimic the system on the basis of input-output examples only. Attention was paid to avoiding possible network overfitting by keeping the number of redundant hidden neurons small: a larger number of hidden-layer neurons could deliver better learning results for the given training examples but, due to the increased network complexity, worse results for fresh examples.
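This trade-off can be illustrated with a small experiment; the sketch below (a hypothetical setup, not the procedure of Murata et al.) trains one-hidden-layer networks of increasing size on a toy input-output sample and compares their errors on the training data and on fresh validation data:

```python
# Hypothetical sketch of the overfitting trade-off: networks with more
# hidden neurons fit the training examples better but may generalize
# worse to fresh examples. Toy data and sizes are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=(200, 1))
y = np.sin(3.0 * x) + 0.1 * rng.normal(size=(200, 1))   # noisy input-output examples
x_tr, y_tr, x_va, y_va = x[:150], y[:150], x[150:], y[150:]

def train_mlp(n_hidden, epochs=2000, lr=0.1):
    """Train a one-hidden-layer network (tanh hidden units, linear output)."""
    W1 = rng.normal(0.0, 0.5, (1, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, (n_hidden, 1)); b2 = np.zeros(1)
    for _ in range(epochs):
        h = np.tanh(x_tr @ W1 + b1)          # hidden-layer activations
        err = (h @ W2 + b2) - y_tr           # linear output minus target
        # Backpropagation of the mean squared error
        gW2 = h.T @ err / len(x_tr); gb2 = err.mean(axis=0)
        dh = (err @ W2.T) * (1.0 - h**2)     # tanh derivative is 1 - tanh^2
        gW1 = x_tr.T @ dh / len(x_tr); gb1 = dh.mean(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2
    return lambda xq: np.tanh(xq @ W1 + b1) @ W2 + b2

for n in (1, 2, 4, 8, 16, 32):
    predict = train_mlp(n)
    e_tr = np.mean((predict(x_tr) - y_tr) ** 2)
    e_va = np.mean((predict(x_va) - y_va) ** 2)
    print(f"hidden={n:2d}  train MSE={e_tr:.4f}  validation MSE={e_va:.4f}")
```

Typically the validation error stops improving, or worsens, beyond some hidden-layer size, while the training error keeps falling.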
As far as the interconnection of network nodes is concerned, full interconnection is recommended for the initial network configuration, in which the output of each neuron in a layer is connected to the input of each neuron in the subsequent layer. However, in some applications, deviations from full interconnection have also been successful.
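In terms of weights, full interconnection between a layer of m neurons and the subsequent layer of n neurons is simply an m-by-n weight matrix, one entry per neuron pair; a minimal sketch (sizes are arbitrary):

```python
import numpy as np

m, n = 3, 2                                       # neurons in a layer / the next layer
W = np.random.default_rng(1).normal(size=(m, n))  # one weight per output-input pair
layer_outputs = np.ones(m)                        # outputs of the current layer
next_inputs = layer_outputs @ W                   # each next-layer input sums all m outputs
```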
For activation function selection there is generally not a rich choice. The usual candidates are the

- sigmoid function, $y = \frac{1}{1 + e^{-x}}$, mostly selected for backpropagation networks in numerous applications, including time series forecasting;
- hyperbolic tangent function, $y = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$, also used successfully in some applications, for instance when solving problems that rely on learning deviations from average behaviour (Klimasauskas, 1991);
- step and ramp functions, additional alternatives favourable for processing binary variables, as sketched in code below.
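A minimal sketch of these four functions (the step threshold and the ramp's clipping range are my own assumptions):

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid y = 1 / (1 + e^(-x)); output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def hyperbolic_tangent(x):
    """y = (e^x - e^(-x)) / (e^x + e^(-x)); output in (-1, 1)."""
    return np.tanh(x)

def step(x):
    """Step function: 0 for negative inputs, 1 otherwise (threshold assumed at 0)."""
    return np.where(x < 0.0, 0.0, 1.0)

def ramp(x):
    """Ramp: identity inside [-1, 1], clipped to the bounds outside (assumed range)."""
    return np.clip(x, -1.0, 1.0)
```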
In any case, to avoid functional destruction of the neuron, the selected function should be bounded at its output, usually between the values -1 and +1. Although there are no guidelines for selecting the activation functions of individual network layers or for distributing them within a layer, it is still best to build homogeneous individual layers and, where possible, to use the sigmoid activation function for the hidden neurons. Still, some researchers have successfully used the hyperbolic tangent as the activation function of hidden-layer neurons; heterogeneous network layers have very seldom been used. For time series forecasting, general experience has shown that a linear activation function delivers the best results for the output neurons, and some theoretical evidence for this has also been given (Rumelhart et al., 1986): output neurons with a nonlinear activation function are required only for forecasting time series with a trend.
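A minimal sketch of the recommended arrangement, here using scikit-learn's MLPRegressor (whose output neuron is linear for regression); the window length of 4 and the hidden-layer size of 8 are illustrative assumptions, not values from the text:

```python
# Hypothetical sketch: sigmoid ("logistic") hidden neurons with a linear
# output neuron for one-step-ahead time series forecasting.
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_windows(series, window=4):
    """Turn a series into (lagged-values, next-value) training pairs."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    return X, series[window:]

series = np.sin(np.linspace(0.0, 20.0, 300))      # toy series without trend
X, y = make_windows(series)

# MLPRegressor applies the identity (linear) activation at the output,
# matching the recommendation for output neurons in forecasting.
model = MLPRegressor(hidden_layer_sizes=(8,), activation="logistic",
                     max_iter=5000, random_state=0)
model.fit(X[:250], y[:250])
print("one-step-ahead forecast:", model.predict(X[250:251]))
```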
 