Gradient Computation
The component of the gradient of the quadratic error with respect to a connection weight of the recurrent network is the sum of the components of the gradient with respect to all the connections of the unwrapped network that share that weight's value. That result was shown in Chap. 2, in the paragraph dedicated to the "shared weights" technique.
The reader wishing to implement one of the foregoing algorithms on a computer will find all the necessary formulas, gathered in a unified framework, in Yacine Oussar's Ph.D. thesis, "Wavelet networks and neural networks for static and dynamic process modeling" (Chap. 3: pages 64 to 69 for input-output models, pages 72 to 81 for state models). The thesis is available in PDF format at http://www.neurones.espci.fr, where a full technical discussion of the algorithms is developed.
4.6.3 Real-Time Learning Algorithms for Recurrent Networks (RTRL)
The real-time recurrent learning (RTRL) method relies on another approximation, different from time truncation. Let us write again the evolution equation of the recurrent network from time n to time n + 1 in its canonical form:

x(n+1) = g[u(n), x(n), w(n)],
y(n+1) = h[u(n), x(n), w(n)].
We want to compute, with the weights w(n), the gradient of the application Ψ₁ⁿ⁺¹ that takes w as input and delivers y = Ψ₁ⁿ⁺¹(w). The computation is performed from an initial state x(0) by using the following sequence of equations: for k = 0, ..., n,

x(k+1) = g[u(k), x(k), w]   and   y = h[u(n), x(n), w].
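As a concrete illustration, this unrolled map from the weights to the output can be sketched in a few lines. The particular state equation g (a tanh unit) and output equation h (a linear map) below are illustrative choices, not taken from the text:

```python
import numpy as np

# Illustrative canonical form (scalar state): these g and h are
# hypothetical examples, not the functions of any specific network.
def g(u, x, w):
    return np.tanh(w[0] * u + w[1] * x)   # state equation: x(k+1) = g[u(k), x(k), w]

def h(u, x, w):
    return w[2] * x + w[3] * u            # output equation: y = h[u(n), x(n), w]

def psi(w, u_seq, x0=0.0):
    """Unrolled map: weights w -> output y, for inputs u(0), ..., u(n)."""
    x = x0
    for u in u_seq[:-1]:                  # k = 0, ..., n-1: advance the state to x(n)
        x = g(u, x, w)
    return h(u_seq[-1], x, w)             # y = h[u(n), x(n), w]
```

Note that the whole sequence is evaluated with a single, fixed weight vector w, which is precisely what the gradient computation below assumes.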
Differentiating those expressions, we obtain

∇_w Ψ₁ⁿ⁺¹[w(n)] = ∇_w h[u(n), x(n), w(n)] + ∇_x h[u(n), x(n), w(n)] · ∇_w Φ₁ⁿ[w(n)],
where Φ₁ⁿ is defined as the application that takes w as input and delivers x = Φ₁ⁿ(w) using the following recursive computation sequence: for k = 0, ..., n−1,

x(k+1) = g[u(k), x(k), w]   and   x = x(n).
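The gradient of this state map can be accumulated forward in time by differentiating the recursion itself: ∇_w x(k+1) = ∂g/∂w + ∂g/∂x · ∇_w x(k), starting from ∇_w x(0) = 0. A minimal sketch, again with an illustrative tanh state equation (not from the text):

```python
import numpy as np

# Forward-sensitivity recursion for the state map, with a hypothetical
# two-weight state equation g(u, x, w) = tanh(w0*u + w1*x).
def g(u, x, w):
    return np.tanh(w[0] * u + w[1] * x)

def dg(u, x, w):
    """Partial derivatives of g with respect to w and to x."""
    s = 1.0 - np.tanh(w[0] * u + w[1] * x) ** 2   # derivative of tanh
    dg_dw = s * np.array([u, x])                  # dg/dw, one entry per weight
    dg_dx = s * w[1]                              # dg/dx
    return dg_dw, dg_dx

def grad_phi(w, u_seq, x0=0.0):
    """Return x(n) and grad_w x(n), propagated forward for k = 0, ..., n-1."""
    x = x0
    dx_dw = np.zeros_like(w)                      # x(0) does not depend on w
    for u in u_seq[:-1]:
        dg_dw, dg_dx = dg(u, x, w)                # partials at the current step
        dx_dw = dg_dw + dg_dx * dx_dw             # chain-rule recursion
        x = g(u, x, w)
    return x, dx_dw
```

The recursion carries one sensitivity entry per weight along with the state, which is what makes a real-time (purely forward) computation possible.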
The problem is the computation of ∇_w Φ₁ⁿ[w(n)]: since we are operating in real time, the value w(n) was not available at the past time steps.
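The usual way out, and the idea underlying RTRL, is to reuse the sensitivities accumulated with the past weights w(k) as an approximation of ∇_w Φ₁ⁿ[w(n)], on the assumption that the weights change slowly from one step to the next. A minimal online sketch of one such update, with illustrative g and h (hypothetical functions, not from the text):

```python
import numpy as np

# One online RTRL step (a sketch): the sensitivity dx/dw carried over from
# past steps was computed with past weights, and is reused as an
# approximation of the gradient at the current weights.
def g(u, x, w):
    return np.tanh(w[0] * u + w[1] * x)   # state equation (weights w0, w1)

def h(u, x, w):
    return w[2] * x                        # output equation (weight w2)

def rtrl_step(w, x, dx_dw, u, target, lr=0.05):
    s = 1.0 - np.tanh(w[0] * u + w[1] * x) ** 2
    dg_dw = s * np.array([u, x, 0.0])      # dg/dw (g does not use w2)
    dg_dx = s * w[1]
    # Output, error, and gradient of (1/2) e^2 via the chain rule
    y = h(u, x, w)
    e = y - target
    dy_dw = np.array([0.0, 0.0, x]) + w[2] * dx_dw
    w_new = w - lr * e * dy_dw             # immediate gradient step
    # Sensitivity recursion, evaluated with the pre-update weights
    dx_dw_new = dg_dw + dg_dx * dx_dw
    x_new = g(u, x, w)
    return w_new, x_new, dx_dw_new
```

Each step costs one forward pass plus the sensitivity update, so the weights can be adapted as the data arrive, at the price of the slowly-varying-weights approximation.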