the network generalization capability. Therefore, care should be taken in selecting
the decay constant λ, because an inappropriate value can deteriorate the
generalization capability of the weight decay process. As a remedy, Weigend et al.
(1991) recommend updating the λ value on-line during the network training in
iterative steps.
Adding the penalty function in the weight decay and optimizing the augmented
performance index corresponds to the regularization method in which the penalty
term is added to the cost function to act as a restriction to the subsequent
optimization problem. In approximation theory, the added term penalizes the
curvature of the original solution, seeking a smoother solution of the
optimization problem.
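The penalized optimization described above can be sketched numerically. The following is a minimal illustration, not any particular author's implementation: a linear model trained by gradient descent on hypothetical data, where the penalty λ‖w‖² added to the squared-error cost makes each gradient step also shrink the weights toward zero.

```python
import numpy as np

# Hypothetical training data: 50 samples, 3 inputs, known true weights.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)

lam = 0.01   # the decay constant lambda discussed in the text (assumed value)
lr = 0.01    # learning rate (assumed value)
w = np.zeros(3)

for _ in range(2000):
    grad_fit = 2 * X.T @ (X @ w - y) / len(y)  # gradient of the data-fit term
    grad_pen = 2 * lam * w                     # gradient of the penalty term:
    w -= lr * (grad_fit + grad_pen)            # each step decays w toward zero

print(np.round(w, 2))
```

Because λ here is small, the recovered weights stay close to the unpenalized least-squares solution; increasing λ would shrink them further at the cost of a worse fit to the data, which is exactly the trade-off the text warns about.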
The regularization method is generally used to solve ill-posed problems . In the
theory of learning, the problems of learning smooth mappings from examples are
mostly ill-posed problems. For their solution Tikhonov (1963) proposed
optimization of the cost function I extended by a term J , which also represents a
cost function. Thus, the resulting cost function to be optimized becomes
$$ I_{res} = I + \lambda J, $$

where λ represents the regularization parameter, which determines the degree of
regularization in the sense of balancing the degree of smoothness of the solution
against its closeness to the training data. The regularization helps in stabilizing the
solution of the ill-posed problem because the added term, representing the penalty
to the original optimization problem, smooths the cost function (Morozov,
1984).
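The balancing role of λ can be seen on a small ridge-regression problem, a standard special case of this scheme in which J is the squared norm of the coefficients. All data below are hypothetical, and the closed-form solve stands in for the general minimization of I_res:

```python
import numpy as np

# Hypothetical data: 30 samples, 5 inputs, true weights all equal to 1.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 5))
y = X @ np.ones(5) + 0.05 * rng.normal(size=30)

def ridge(lam):
    # Closed-form minimizer of I_res = ||y - Xw||^2 + lam * ||w||^2
    return np.linalg.solve(X.T @ X + lam * np.eye(5), X.T @ y)

w_small, w_large = ridge(1e-3), ridge(1e3)

# Small lambda: solution stays close to the training data (w near the truth).
# Large lambda: coefficients are driven toward zero (a "smoother" solution).
print(np.linalg.norm(w_small), np.linalg.norm(w_large))
```

The two solutions make the trade-off concrete: as λ grows, closeness to the data is sacrificed for a smaller-norm, more regular solution.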
The regularization approach determines the so-called Tikhonov functional

$$ I_{res}(f) = \sum_{i=1}^{n} \left( y_i - f(\mathbf{x}_i) \right)^2 + \lambda \left\| Pf \right\|^2 , $$
the first term of which represents the closeness to the data, and in the second term f
is the input-output function, P is a linear differential constraint operator, and $\|\cdot\|$ is
a norm on the function space to which Pf belongs. This operator also embodies the
a priori knowledge about the problem solution.
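A discretized sketch of this functional may help. On a grid, a second-difference matrix can stand in for the differential operator P (an assumption for illustration; the text does not fix a particular P), so that ‖Pf‖² penalizes curvature, and minimizing the quadratic functional reduces to one linear solve:

```python
import numpy as np

# Hypothetical data: noisy samples of sin(2*pi*x) on a grid of n points.
n = 100
x = np.linspace(0, 1, n)
rng = np.random.default_rng(2)
y = np.sin(2 * np.pi * x) + 0.3 * rng.normal(size=n)

# P approximates the second derivative on the grid (interior rows only),
# so ||P f||^2 penalizes the curvature of the solution.
P = (np.diag(np.full(n, -2.0))
     + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1))[1:-1]

lam = 10.0  # regularization parameter (assumed value)

# Minimizing ||y - f||^2 + lam * ||P f||^2 over f gives the linear system
# (I + lam * P^T P) f = y.
f = np.linalg.solve(np.eye(n) + lam * P.T @ P, y)

# The regularized f has much smaller curvature norm than the raw data.
print(np.linalg.norm(P @ f), np.linalg.norm(P @ y))
```

The recovered f is a smoothed version of the noisy samples: the data term keeps it near y, while the curvature penalty suppresses the high-frequency noise, which is the stabilizing effect the regularization theory describes.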
To solve the regularization problem we proceed with the minimization of the
extended cost function $I_{res}$, using the resulting partial derivatives with respect to f in
order to build the Euler-Lagrange equation

$$ \hat{P}P f(\mathbf{x}) = \frac{1}{\lambda} \sum_{i=1}^{n} \left( y_i - f(\mathbf{x}_i) \right) \delta(\mathbf{x} - \mathbf{x}_i) , $$
in which the operator P and its adjoint operator $\hat{P}$ build the differential operator
$\hat{P}P$. Therefore, the above Euler-Lagrange equation is a partial differential equation.
Its solution can, therefore, be expressed as the integral transformation of the right-