In addition to reverting the previous weight change in the case of ∂E/∂w_ij(t−1) · ∂E/∂w_ij(t) < 0, the partial derivative is also set to zero (∂E/∂w_ij(t) = 0). This prevents the sign of the derivative from changing once again in the succeeding step and thus a potential double punishment.
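This sign-flip handling can be sketched as follows; a hypothetical NumPy illustration of Rprop with weight-backtracking (function and variable names are our own, not from the paper):

```python
import numpy as np

def rprop_step(w, grad, prev_grad, step, prev_dw,
               eta_plus=1.2, eta_minus=0.5,
               step_min=1e-6, step_max=50.0):
    """One Rprop update with weight-backtracking (hypothetical sketch).

    All arguments are arrays of the same shape; prev_dw holds the
    weight change applied in the previous step. Returns the updated
    (w, grad, step, dw); grad is zeroed where the sign flipped, so the
    next step cannot register a second sign change (no double punishment).
    """
    change = prev_grad * grad
    dw = np.zeros_like(w)

    # Sign unchanged: grow the step size; sign flipped: shrink it.
    inc = change > 0
    step[inc] = np.minimum(step[inc] * eta_plus, step_max)
    dec = change < 0
    step[dec] = np.maximum(step[dec] * eta_minus, step_min)

    # Where the sign flipped: revert the previous weight change and
    # set the stored derivative to zero.
    dw[dec] = -prev_dw[dec]
    grad[dec] = 0.0

    # Everywhere else: a plain step against the gradient.
    rest = ~dec
    dw[rest] = -np.sign(grad[rest]) * step[rest]

    w += dw
    return w, grad, step, dw
```

Because the reverted entries carry a zero derivative, the product ∂E/∂w_ij(t−1) · ∂E/∂w_ij(t) is zero in the following step, which routes those weights through the "plain step" branch.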
We use a nonlinear activation function with parameters recommended by LeCun et al. [11] for all neurons in the network, as well as for the PB units (Eq. 2):

sigmoid(x) = 1.7159 · tanh((2/3) · x).    (7)
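Equation (7) is a scaled tanh; a minimal implementation for reference:

```python
import numpy as np

def lecun_sigmoid(x):
    """Scaled tanh recommended by LeCun et al.: 1.7159 * tanh(2x/3).

    The constants are chosen so that f(1) is approximately 1 and
    f(-1) approximately -1, keeping unit-magnitude targets inside
    the near-linear region of the activation.
    """
    return 1.7159 * np.tanh(2.0 / 3.0 * x)
```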
2.2 Retrieval
The PB vector is usually low dimensional and resembles bifurcation parameters of a
nonlinear dynamical system, i.e. it characterizes fixed-point dynamics of the RNN.
During training the PB values are self-organized, thereby encoding each time series
and arranging it in PB space according to the properties of the training pattern. This
means that the values of similar sequences are clustered together, whereas more dis-
tinguishable ones are located further apart. Once learned, the PB values can be used
for the generation of the time series previously stored. For this purpose, the network
is operated in closed-loop mode. The PB values are 'clamped' to a previously learned
value and the forward pass of the network is executed from an initial input I(0). In the following time steps, the output at time t serves as the input at time t+1. This leads
to a reconstruction of the training sequence with a very high accuracy (limited by the
convergence threshold used during learning).
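The closed-loop generation procedure can be sketched as follows; `rnn_step` is a hypothetical stand-in for one forward pass of the trained network (names are our own, not from the paper):

```python
import numpy as np

def generate(rnn_step, pb, x0, n_steps):
    """Closed-loop generation (hypothetical sketch).

    `rnn_step(x, pb, h)` is assumed to run one forward pass of the
    trained RNN with clamped PB vector `pb` and hidden state `h`,
    returning (output, new_hidden_state). Each output is fed back
    as the input of the next time step.
    """
    h = None          # hidden state, produced by rnn_step itself
    x = x0            # initial input I(0)
    outputs = []
    for _ in range(n_steps):
        x, h = rnn_step(x, pb, h)   # output at t becomes input at t+1
        outputs.append(x)
    return np.array(outputs)
```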
2.3 Recognition
A previously stored (time) sequence can also be recognized via its corresponding PB
value. To this end, the observed sequence is fed into the network without updating any connection weights. Only the PB values are accumulated according to Eqs. 1 and 2, this time using a constant learning rate γ. Once a stable PB vector is reached, it can be
compared to the one obtained during training.
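The recognition procedure reduces to gradient steps on the PB vector alone; a hypothetical sketch (the error gradient `pb_grad` stands in for backpropagating the prediction error of the frozen-weight network):

```python
import numpy as np

def recognize(pb_grad, pb0, sequence, gamma=0.01, n_epochs=50):
    """PB-only recognition (hypothetical sketch).

    `pb_grad(pb, sequence)` is assumed to return dE/dpb, the gradient
    of the prediction error of the network (weights frozen) on the
    observed `sequence`. Only the PB vector is adapted, with a
    constant learning rate gamma.
    """
    pb = np.array(pb0, dtype=float)
    for _ in range(n_epochs):
        pb -= gamma * pb_grad(pb, sequence)   # no weight updates
    return pb
```

The resulting stable PB vector can then be compared against the PB values stored during training, e.g. by nearest-neighbor matching in PB space.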
2.4 Generalized Recognition and Generation
The network has substantial generalization potential. Not only can previously stored sequences be reconstructed and recognized; (time) sequences beyond the stored patterns can also be generated. Since only the PB values but not the synaptic weights
are updated in recognition mode, a stable PB value can also be assigned to an unknown
sequence.
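Because the PB space varies continuously, intermediate patterns can be generated from PB vectors interpolated between learned ones; a minimal sketch (function name is our own):

```python
import numpy as np

def interpolate_pb(pb_a, pb_b, alpha):
    """Linear interpolation between two learned PB vectors
    (hypothetical sketch): alpha=0 yields pb_a, alpha=1 yields pb_b.
    Intermediate alpha values, fed to the network in generation mode,
    would produce patterns between the two training sequences.
    """
    return (1.0 - alpha) * np.asarray(pb_a) + alpha * np.asarray(pb_b)
```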
For instance, training the network with two sine waves of different frequencies allows
cyclic functions with intermediate frequencies to be generated simply by operating the
network in generation mode and varying the PB values within the interval of the PB
values obtained during training. Furthermore, the PB values obtained during recognition
of a previously unseen sine function with an intermediate frequency (w.r.t. the training
sequences) will lie within the range of the PB values acquired during learning. Hence,