follows the strategy of minimization of the expected classification risk. The
strategy can be explained in terms of an n -dimensional input vector x belonging to
one of m possible classes with the probability density functions
$$p_1(\mathbf{x}),\; p_2(\mathbf{x}),\; \ldots,\; p_m(\mathbf{x}).$$
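The risk-minimizing decision itself is usually expressed with the standard Bayes rule; the prior probabilities $h_k$ and misclassification costs $l_k$ below are the customary ingredients of that rule and are introduced here only for illustration, not taken from the text:
$$\text{decide class } i \quad \text{if} \quad h_i\, l_i\, p_i(\mathbf{x}) > h_j\, l_j\, p_j(\mathbf{x}) \quad \text{for all } j \neq i.$$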
The architecture of a probabilistic network, shown in Figure 3.11, consists of an
input layer followed by three computational layers. It bears a striking similarity to a multilayer perceptron network. As shown, the network discriminates between two pattern categories, represented by positive and negative output signals. To extend this capability to discrimination among more than two categories, additional network outputs and a corresponding number of summation units are required.
The input layer of a probabilistic network is simply a distribution layer that
provides the normalized input signal values to all classifying networks that make
up a multiple-class classifier. The subsequent layer consists of a number of pattern units, fully connected to the input layer through adjustable weights that correspond to the number of categories to be classified. Each pattern unit forms the product of the input vector x with its weight vector w_i. Before being passed to the corresponding summation unit, the product value undergoes the nonlinear operation
$$F(\mathbf{x} \cdot \mathbf{w}_i) = e^{\frac{\mathbf{x} \cdot \mathbf{w}_i - 1}{\sigma^{2}}}.$$
However, since both the input pattern and the weight vectors are normalized to unit length, the last relation can be rewritten as
$$F(\mathbf{x} \cdot \mathbf{w}_i) = e^{-\frac{\sum_{j=1}^{n} (x_j - w_{ij})^{2}}{2\sigma^{2}}}.$$
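The rewriting follows directly from the unit-length normalization of x and w_i, since
$$\|\mathbf{x} - \mathbf{w}_i\|^2 = \|\mathbf{x}\|^2 - 2\,\mathbf{x}\cdot\mathbf{w}_i + \|\mathbf{w}_i\|^2 = 2\,(1 - \mathbf{x}\cdot\mathbf{w}_i), \qquad \text{so} \qquad \frac{\mathbf{x}\cdot\mathbf{w}_i - 1}{\sigma^2} = -\frac{\|\mathbf{x} - \mathbf{w}_i\|^2}{2\sigma^2}.$$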
The summation units finally add the signals coming from the pattern units
corresponding to the category selected for the current training pattern.
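As a concrete illustration of this forward pass, the following minimal Python sketch combines pattern units using the Gaussian nonlinearity above with per-class summation units. The class name PNNSketch, the smoothing parameter sigma, the use of one pattern unit per stored example, and the sample data are illustrative assumptions, not details taken from the text.

import numpy as np

def normalize(v):
    # Scale a vector to unit length, as assumed for inputs and weights.
    return v / np.linalg.norm(v)

class PNNSketch:
    # Minimal probabilistic-network forward pass (illustrative sketch).
    # Each stored pattern acts as one pattern unit; summation units
    # accumulate the pattern-unit outputs belonging to the same category.

    def __init__(self, sigma=0.5):
        self.sigma = sigma      # smoothing parameter (assumed value)
        self.weights = []       # pattern-unit weight vectors w_i
        self.labels = []        # category of each pattern unit

    def add_pattern(self, x, label):
        self.weights.append(normalize(np.asarray(x, dtype=float)))
        self.labels.append(label)

    def classify(self, x):
        x = normalize(np.asarray(x, dtype=float))
        scores = {}
        for w, label in zip(self.weights, self.labels):
            # pattern unit: F(x.w) = exp((x.w - 1) / sigma^2)
            activation = np.exp((np.dot(x, w) - 1.0) / self.sigma ** 2)
            # summation unit: add activations of the same category
            scores[label] = scores.get(label, 0.0) + activation
        return max(scores, key=scores.get)

# Usage with made-up two-dimensional patterns for two categories:
net = PNNSketch(sigma=0.5)
net.add_pattern([1.0, 0.1], label="A")
net.add_pattern([0.9, 0.2], label="A")
net.add_pattern([0.1, 1.0], label="B")
print(net.classify([0.95, 0.15]))   # expected to print "A"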
3.4 Network Training Methods
We now turn our attention to some training aspects of neural networks, particularly to accelerating the training process and to the quality of the training results. Our primary interest is in supervised learning algorithms, the ones most frequently used in real applications, such as the backpropagation training algorithm, also known as the generalized delta rule.
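For reference, the weight update prescribed by the generalized delta rule is usually written as follows; this standard formulation (learning rate $\eta$, unit outputs $o_i$, targets $t_j$, activation function $f$) is not spelled out in the text above and is given here only as a reminder:
$$\Delta w_{ij} = \eta\, \delta_j\, o_i, \qquad \delta_j = \begin{cases} (t_j - o_j)\, f'(\mathrm{net}_j), & j \text{ an output unit},\\[4pt] f'(\mathrm{net}_j) \sum_k \delta_k\, w_{jk}, & j \text{ a hidden unit}.\end{cases}$$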
The backpropagation algorithm was initially developed by Paul Werbos in
1971 but it remained almost unknown until it was “rediscovered” by Parker in
1982. The algorithm, however, became widely popular after being clearly
formulated by Rumelhart et al. (1986), which was a triggering moment for
 