w is the concatenation of the (n, 1) vector parameter a and of the scalar parameter b; Y is a second-order real random variable. We have
\[
J(a, b) = E\big[(Y - Xa - b)^2\big].
\]
The data samples $(X_1, Y_1), \dots, (X_k, Y_k), \dots$ are available on-line to solve the estimation problem. Since they are independent, the stochastic gradient approach may be used. The recursive stochastic gradient estimate is defined by the following formulas:
\[
\begin{aligned}
a_{k+1} &= a_k + \gamma_{k+1}\,(Y_{k+1} - X_{k+1} a_k - b_k)\, X_{k+1},\\
b_{k+1} &= b_k + \gamma_{k+1}\,(Y_{k+1} - X_{k+1} a_k - b_k).
\end{aligned}
\]
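To make the recursion concrete, here is a minimal Python sketch. Everything about the data (the true coefficients `a_true` and `b_true`, the Gaussian inputs, and the noise level) is assumed purely for illustration, and the gain $\gamma_k = 1/k$ is one admissible choice, as discussed below.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground truth for the synthetic data (not from the text).
n = 3
a_true = np.array([2.0, -1.0, 0.5])
b_true = 0.7

a = np.zeros(n)  # current estimate a_k, initialized at zero
b = 0.0          # current estimate b_k

for k in range(1, 200_001):
    X = rng.normal(size=n)                         # independent input sample
    Y = X @ a_true + b_true + 0.1 * rng.normal()   # noisy observation
    gamma = 1.0 / k                                # gain sequence gamma_k = 1/k
    err = Y - X @ a - b                            # prediction error Y - X a_k - b_k
    a = a + gamma * err * X                        # recursive update of a
    b = b + gamma * err                            # recursive update of b

print(a, b)  # approaches the regression coefficients of Y with respect to X
```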
We have the following convergence statement:
If the gain of the algorithm obeys the conditions
\[
\sum_{k=1}^{\infty} \gamma_k = \infty, \qquad \sum_{k=1}^{\infty} \gamma_k^2 < \infty,
\]
then the algorithm converges almost surely to the linear regression coefficients of Y with respect to X.
The conditions on the gain that have just been stated are general. Hereinafter, they will be referred to as the stochastic approximation conditions for the gain. In particular, the sequence $\gamma_k = 1/k$ obeys those conditions.
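Indeed, for this sequence both conditions can be checked directly:
\[
\sum_{k=1}^{\infty} \frac{1}{k} = \infty \quad \text{(harmonic series)}, \qquad
\sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6} < \infty .
\]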
4.3.3 Recursive Identification of an AR Model
Consider the identification problem of the AR(p) model
\[
X(k+1) = a_1 X(k) + \dots + a_p X(k-p+1) + V(k+1).
\]
We assume that the data are collected under a stationary regime, and we are looking for a recursive estimate that minimizes the least-squares criterion
\[
J(w) = \tfrac{1}{2}\, E\big[\big(X(k+1) - a_1 X(k) - \dots - a_p X(k-p+1)\big)^2\big].
\]
The gradient of the cost function is
\[
\nabla J(w) = -E\big\{\big[X(k+1) - a_1 X(k) - \dots - a_p X(k-p+1)\big]\,\big[X(k); \dots; X(k-p+1)\big]\big\}.
\]
Thus, the stochastic gradient recursive estimate is defined by the algorithm
\[
w(k+1) = w(k) + \gamma_{k+1}\,\vartheta(k+1)\,\big[X(k); \dots; X(k-p+1)\big],
\]
with $\vartheta(k+1) = X(k+1) - a_1 X(k) - \dots - a_p X(k-p+1)$.
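The same pattern applies here; the following sketch simulates data from a hypothetical stable AR(2) (the coefficients 0.5 and −0.3 and the zero initial state are assumptions chosen for illustration, not taken from the text) and runs the recursion on it.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stable AR(2) used to generate data:
# X(k+1) = 0.5 X(k) - 0.3 X(k-1) + V(k+1), with V white Gaussian noise.
a_true = np.array([0.5, -0.3])
p = a_true.size

w = np.zeros(p)    # estimate of (a_1, ..., a_p)
phi = np.zeros(p)  # regressor [X(k); ...; X(k-p+1)], zero initial state

for k in range(1, 200_001):
    x_next = a_true @ phi + rng.normal()        # next observation X(k+1)
    gamma = 1.0 / k                             # stochastic approximation gain
    theta = x_next - w @ phi                    # prediction error, the term called theta(k+1)
    w = w + gamma * theta * phi                 # delta-rule update of w
    phi = np.concatenate(([x_next], phi[:-1]))  # shift the regressor window

print(w)  # approaches the true coefficients (0.5, -0.3)
```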
This rule was encountered previously and has long been known as the delta rule or Widrow rule. If the gain sequence obeys the stochastic approximation conditions, the algorithm converges, so that the estimate is consistent.
In the case of AR models, the input-output data are no longer independent.
Therefore, the classical assumptions of the elementary law of large numbers
are not fulfilled. The following Markov linear model produces the data: