integrates all the available information coming from the past up to time n − p + 1, through p successive copies. It can therefore be trusted; nevertheless, it introduces both error and instability. It was shown in [Lion 2000] that this approximation can be controlled by introducing a projection that confines the results within reasonable bounds. Then, using stochastic approximation theory, it is shown that the algorithm converges towards a local minimum (the minimum is only local because the framework is nonlinear and not necessarily convex).
Thus, it is important to distinguish two different time indices in the computation: the learning-step index n, and the time-step index of the unfolded network, denoted by k, k = 1 to p. A copy of the network is defined by the two functions g and h, which respectively determine the state and the output of the network at step k as a function of the network state, its input, and its previous parameter values. We are now able to describe in detail the operations that are necessary to compute the gradient by backpropagation through time during training step n + 1. All the current values of the network parameters are stored in the parameter vector w.
For the (n + 1)th learning step, the following components of the input vector are used:
$$u_{n+1}^{k-1} = u_{n-p+k}, \quad k = 1 \text{ to } p,$$
and the following output data:
$$\psi_{n+1}^{k} = \psi_{n-p+k+1}, \quad k = 1 \text{ to } p.$$
If the state of the network cannot be measured (undirected learning), the state estimate from the previous learning step is used as the initial state of the unfolded network:
$$x_{n+1}^{0} = x_{n-p+1} = x_{n}^{1}.$$
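This bookkeeping can be sketched in a few lines of Python. All names here are hypothetical; the lists u and psi are assumed to be indexed directly by the time index, and x_prev by the copy index k of the previous learning step:

```python
def training_window(u, psi, x_prev, n, p):
    """Assemble the data for training step n+1 of backpropagation
    through time over a window of p copies.

    u, psi : hypothetical sequences, indexed by time (u[t], psi[t]).
    x_prev : state estimates x_n^k from the previous learning step,
             indexed by copy k = 0 .. p.
    Returns the inputs u_{n+1}^{k-1}, the desired outputs psi_{n+1}^k
    (k = 1 to p), and the initial state x_{n+1}^0 = x_n^1
    (undirected learning).
    """
    inputs = [u[n - p + k] for k in range(1, p + 1)]         # u_{n+1}^{k-1}
    desired = [psi[n - p + k + 1] for k in range(1, p + 1)]  # psi_{n+1}^k
    x0 = x_prev[1]                                           # x_n^1
    return inputs, desired, x0
```

For instance, with n = 10 and p = 3 the window covers the inputs at times 8, 9, 10 and the desired outputs at times 9, 10, 11.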
At training step n + 1, the following computations are performed on the unfolded network that was obtained at the previous step:

computation of the state and of the output for k = 1 to p,
$$x_{n+1}^{k} = g\bigl(u_{n+1}^{k-1},\, x_{n+1}^{k-1},\, w\bigr),$$
$$y_{n+1}^{k} = h\bigl(u_{n+1}^{k-1},\, x_{n+1}^{k-1},\, w\bigr);$$

comparison with the desired outputs for k = 1 to p,
$$\varepsilon_{n+1}^{k} = \psi_{n+1}^{k} - y_{n+1}^{k};$$
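The forward sweep and the error computation can be sketched as follows for a toy scalar model; the particular choices of g and h (a one-neuron tanh state function and a linear readout) and all names are illustrative assumptions, not the model treated here:

```python
import math

def g(u_in, x_in, w):
    # hypothetical state function: a single tanh neuron
    return math.tanh(w["a"] * x_in + w["b"] * u_in)

def h(u_in, x_in, w):
    # hypothetical output function: linear readout
    return w["c"] * x_in + w["d"] * u_in

def forward_unfolded(inputs, desired, x0, w):
    """Sweep through the p copies of the unfolded network.

    inputs[k-1] holds u_{n+1}^{k-1}, desired[k-1] holds psi_{n+1}^k.
    Returns the states x_{n+1}^k, the outputs y_{n+1}^k and the errors
    eps_{n+1}^k = psi_{n+1}^k - y_{n+1}^k, for k = 1 to p.
    """
    x = x0
    states, outputs, errors = [], [], []
    for u_k, psi_k in zip(inputs, desired):
        y = h(u_k, x, w)           # y_{n+1}^k uses the previous state
        x = g(u_k, x, w)           # x_{n+1}^k
        states.append(x)
        outputs.append(y)
        errors.append(psi_k - y)   # eps_{n+1}^k
    return states, outputs, errors
```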
computation of the adjoint unfolded network (the adjoint unfolded network is built by inverting the direction of signal propagation, replacing nodes by adders and adders by nodes, and replacing the activation functions by their derivatives).
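A backward sweep through the adjoint network yields the gradient of the quadratic cost J = ½ Σ_k (ε_{n+1}^k)². The sketch below does this for a toy scalar model with x^k = tanh(a x^{k-1} + b u^{k-1}) and y^k = c x^{k-1} + d u^{k-1} (the model and all names are illustrative assumptions): the forward sweep stores the states, and the backward sweep propagates the adjoint variable λ^k = ∂J/∂x^k, with tanh replaced by its derivative as described above.

```python
import math

def bptt_gradient(inputs, desired, x0, w):
    """Gradient of J = 0.5 * sum_k eps_k**2 for the toy scalar model
    x^k = tanh(a*x^{k-1} + b*u^{k-1}),  y^k = c*x^{k-1} + d*u^{k-1},
    computed by a backward sweep through the adjoint unfolded network.
    """
    a, b, c, d = w["a"], w["b"], w["c"], w["d"]
    # forward sweep: the states are stored, the adjoint needs them
    xs, eps = [x0], []
    for u_k, psi_k in zip(inputs, desired):
        y = c * xs[-1] + d * u_k
        eps.append(psi_k - y)
        xs.append(math.tanh(a * xs[-1] + b * u_k))
    # backward sweep through the adjoint network
    grad = {"a": 0.0, "b": 0.0, "c": 0.0, "d": 0.0}
    lam = 0.0                        # lam = dJ/dx^k, zero at k = p
    for k in range(len(inputs), 0, -1):
        u_k, x_prev, x_k, e = inputs[k-1], xs[k-1], xs[k], eps[k-1]
        grad["c"] += -e * x_prev     # output branch: dJ/dy^k = -eps^k
        grad["d"] += -e * u_k
        s = 1.0 - x_k * x_k          # derivative of tanh at copy k
        grad["a"] += lam * s * x_prev
        grad["b"] += lam * s * u_k
        lam = -e * c + lam * s * a   # dJ/dx^{k-1}
    return grad
```

The returned gradient can be checked against a central finite difference of J, which agrees to high precision for this model.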