0.5 to 10 with 0.5 steps, and the number of previous values L ∈ {8, 10, 12} was used. Table 6.8 only contains the best results for each network. Both networks were evaluated using 2, 4 and 6 neurons.
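For concreteness, the parameter sweep described above can be enumerated as follows. This is a minimal illustrative sketch, not the authors' code; it assumes the bandwidth h, window size L and neuron count q are varied on a full Cartesian grid for the RTRL-ZED network (standard RTRL varies only the neuron count):

import itertools
import numpy as np

# Grid described in the text: h from 0.5 to 10 in steps of 0.5,
# L in {8, 10, 12}, and q in {2, 4, 6} neurons.
h_values = np.arange(0.5, 10.0 + 0.25, 0.5)  # 0.5, 1.0, ..., 10.0 (20 values)
L_values = [8, 10, 12]
q_values = [2, 4, 6]

configs = list(itertools.product(h_values, L_values, q_values))
print(len(configs))  # 20 * 3 * 3 = 180 RTRL-ZED configurations (assumed full grid)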
We can see that while the average error was around 4.5% for the RTRL algorithm, RTRL-ZED had errors around 1.5%. All results obtained with RTRL-ZED are statistically significantly better (smaller error) than those obtained with RTRL (t-test, p ≈ 0). It appears that only 2 neurons are enough in both cases to learn the problem, but RTRL-ZED seems to benefit from an increase in the number of neurons, since the result for q = 6 had an error of 1.4%, which is smaller than the errors obtained with fewer neurons.
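As an illustration of the kind of significance test reported above, the following minimal Python sketch (not the authors' code) compares two sets of per-repetition error percentages; the arrays are hypothetical placeholders, and a one-sided Welch-type test is assumed:

import numpy as np
from scipy import stats

# Hypothetical per-repetition error percentages for each network.
rtrl_errors = np.array([4.4, 4.6, 4.5, 4.7, 4.3])
rtrl_zed_errors = np.array([1.5, 1.4, 1.6, 1.5, 1.4])

# One-sided test: is the mean RTRL-ZED error smaller than the mean RTRL error?
t_stat, p_value = stats.ttest_ind(rtrl_zed_errors, rtrl_errors,
                                  equal_var=False, alternative='less')
print(f"t = {t_stat:.2f}, p = {p_value:.2e}")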
The second experiment is adapted from [2], and consists of predicting the next symbol of the sequence 01001000100001..., up to twenty zeros, always followed by a one. The number of symbols the network needed to see in order to correctly make the remaining predictions until the end of the sequence was recorded. The sequence is composed of 230 symbols. A hundred repetitions were made, starting with random initialization of the weights, with the learning rate, η, ranging from 5 to 39 for the standard RTRL and from 2 to 9 for RTRL-ZED; the kernel bandwidth, h, varied in {8, 10, 12}, and the size of the sliding window for the temporal estimation of the density, L, in {1, 2, 3}.
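The stated length of 230 symbols is easy to verify: the runs of zeros grow from one to twenty, each terminated by a one, so the total is Σ_{n=1}^{20}(n + 1) = 210 + 20 = 230. A minimal sketch (illustrative, not from the source) that generates the sequence:

# Generate the test sequence 0 1 00 1 000 1 ... with runs of zeros growing
# from one up to twenty, each run always followed by a one.
def make_sequence(max_zeros=20):
    symbols = []
    for n in range(1, max_zeros + 1):
        symbols.extend([0] * n)  # n zeros ...
        symbols.append(1)        # ... followed by a one
    return symbols

seq = make_sequence()
print(len(seq))  # 230, matching the length stated in the text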
The results are shown in Fig. 6.13. Each point in this figure represents the percentage of convergence over 100 experiments versus the corresponding average number of symbols (NS) necessary to learn the problem, for the standard RTRL (star) and the RTRL-ZED (square) networks. The various points were obtained by changing the parameters η, L, and h (in the case of standard RTRL only η is used). Only the cases where at least one of the 100 repetitions converged were plotted.
The figure shows that standard RTRL is not able to obtain more than 40% convergence, whereas RTRL-ZED can reach 100% convergence. It can also be observed that, in general, for a given value of NS, RTRL-ZED is able to obtain higher percentages of convergence than the original RTRL. A slight advantage of the original RTRL over the new proposal is that it is able to learn the problem with fewer symbols, although only by a small margin.
6.2.2 Long Short-Term Memory
Typical RNN implementations suffer from the problem of losing error information pertaining to long time lags. This occurs because the error signals tend to vanish over time [95]. One of the most promising machines for sequence learning, addressing this information loss issue, is the Long Short-Term Memory (LSTM) recurrent neural network [103, 82, 81, 171]. In fact, it has been shown that LSTM outperforms traditional RNNs such as Elman networks [63], Back-Propagation Through Time (BPTT) [242] and Real-Time Recurrent Learning (RTRL).
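The vanishing of error signals can be illustrated numerically. The following minimal sketch (not from the source, and ignoring the activation-function derivative for simplicity) back-propagates a gradient through a contractive recurrent weight matrix and shows its norm decaying exponentially:

import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((6, 6))
W *= 0.5 / np.linalg.norm(W, 2)  # scale so the spectral norm is 0.5 (< 1)

grad = rng.standard_normal(6)    # error signal at the final time step

# Each step of back-propagation through time multiplies by the recurrent
# Jacobian (here simply W.T), so the gradient norm shrinks by ~0.5 per step.
for t in range(1, 31):
    grad = W.T @ grad
    if t % 10 == 0:
        print(f"after {t:2d} steps: ||grad|| = {np.linalg.norm(grad):.3e}")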
 