$$\mathrm{AIC} = N k \ln\!\left(V^2\right) + 2K,$$
where N is the number of training data points, k is the number of output units of the network, $V^2$ is the maximum likelihood estimate of the mean square error for the training data, and K is the number of model parameters.
The application principle of the AIC is that, if two models achieve the same mean square error on a training data set, the smaller model should be selected. Equivalently, from a set of candidate models, the one with the smallest AIC value is selected (Ishikawa and Moriyama, 1996; Anders and Korn, 1999). This, however, requires the whole set of candidate models to be built and their parameters estimated before the selection rule can be applied.
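As an illustration, the following minimal sketch applies this rule to a handful of hypothetical candidate networks; the sample size, error estimates, and parameter counts are invented for the example, and the `aic` helper simply evaluates the formula above.

```python
import math

def aic(N, k, V2, K):
    """AIC = N*k*ln(V^2) + 2*K, with V2 the maximum likelihood
    estimate of the mean square error on the training data."""
    return N * k * math.log(V2) + 2 * K

# Hypothetical candidates: name -> (mean square error V^2, parameter count K).
candidates = {"small": (0.120, 25), "medium": (0.095, 60), "large": (0.094, 140)}

N, k = 500, 1  # assumed training set size and number of output units
scores = {name: aic(N, k, V2, K) for name, (V2, K) in candidates.items()}
best = min(scores, key=scores.get)  # the smallest AIC wins
print(scores, "->", best)
```

Note how the largest model is penalized: its marginal reduction in error does not offset the 2K term.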
Unfortunately, direct application of the AIC to neural networks is rather cumbersome. It is, however, made easier by the network information criterion (NIC) of Stone (1977),
$$\mathrm{NIC} = -\frac{1}{T} \ln L(\hat{w}) + \frac{1}{T}\,\mathrm{tr}\!\left[ B A^{-1} \right],$$
which is a generalization of the AIC. Here T is the number of training examples, and the first term is built from $\ln L(\hat{w})$, the estimated maximum of the logarithmic likelihood. The matrices A and B are defined as
$$A = -E\!\left\{ \nabla \nabla^{\mathsf{T}} \ln L_t \right\}, \qquad B = E\!\left\{ \nabla \ln L_t \, \nabla^{\mathsf{T}} \ln L_t \right\},$$

where $\nabla$ denotes the gradient with respect to the model parameters $w$ and $L_t$ is the likelihood contribution of the $t$-th training example.
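A numeric sketch of these quantities follows, using a linear-Gaussian model in place of a neural network so that the per-example gradients and Hessians of $\ln L_t$ have closed forms; all data, dimensions, and names are invented for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data from a linear-Gaussian model (an illustrative stand-in for a
# network): y_t = x_t @ w_true + noise, so
# ln L_t(w) = -0.5 * (y_t - x_t @ w)**2 / s2 + const.
T, K = 200, 3
X = rng.normal(size=(T, K))
w_true = np.array([1.0, -2.0, 0.5])
s2 = 0.25                                     # assumed known noise variance
y = X @ w_true + rng.normal(scale=np.sqrt(s2), size=T)

w_hat = np.linalg.lstsq(X, y, rcond=None)[0]  # maximum likelihood estimate
r = y - X @ w_hat                             # residuals at w_hat

# A = -E{grad grad^T ln L_t}: sample average of per-example Hessians.
A = (X.T @ X) / (T * s2)
# B = E{grad ln L_t grad^T ln L_t}: outer products of per-example scores.
scores = (r[:, None] * X) / s2                # per-example gradients of ln L_t
B = scores.T @ scores / T

log_lik = -0.5 * np.sum(r**2) / s2 - 0.5 * T * np.log(2 * np.pi * s2)
penalty = np.trace(B @ np.linalg.inv(A))      # approaches K = 3 here
nic = -log_lik / T + penalty / T
print("tr[B A^-1] =", penalty, " NIC =", nic)
```

Because the toy data really do come from the fitted model class, the computed $\mathrm{tr}[BA^{-1}]$ lands close to K, which anticipates the asymptotic result below.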
If the classes of models investigated include the true model, then it holds
asymptotically that A=B and
$$\mathrm{tr}\!\left[ B A^{-1} \right] = \mathrm{tr}\!\left[ I \right] = K,$$
where K is, again, the number of model parameters. In this case the NIC takes the
form
$$\mathrm{NIC} = -\frac{1}{T} \ln L(\hat{w}) + \frac{K}{T}.$$
This is similar to the AIC, which in this notation becomes
$$\mathrm{AIC} = -\frac{2}{T} \ln L(\hat{w}) + \frac{2K}{T}.$$
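Since both criteria are used only to rank candidate models, the constant factor is immaterial: in this form the AIC is exactly twice the NIC, as a quick check confirms (the numbers below are illustrative).

```python
def nic_value(log_lik, T, K):
    """NIC = -ln L(w)/T + K/T, under the asymptotic A = B simplification."""
    return -log_lik / T + K / T

def aic_value(log_lik, T, K):
    """AIC = -2 ln L(w)/T + 2K/T in the same scaling."""
    return -2.0 * log_lik / T + 2.0 * K / T

log_lik, T, K = -310.0, 200, 3  # illustrative maximized log-likelihood
print(nic_value(log_lik, T, K))  # 1.565
print(aic_value(log_lik, T, K))  # 3.130, exactly twice the NIC
```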