Information Technology Reference
In-Depth Information
(a)
(b)
(c)
Fig. 9.22. MNIST example sequences of degraded digits. A random background level, con-
trast reduction, occlusion, and pixel noise were combined: (a) low noise; (b) medium noise;
(c) high noise.
A random background level, contrast reduction, occlusion, and pixel noise are com-
bined to produce input sequences.
For each sequence, a random movement direction is selected for the occluding
square from 8 possible choices that form an 8-neighborhood. The square moves with
a speed of one step per iteration until it is reflected near an image border. Thus, most
digit parts are visible at some point of the sequence, and the network has the chance
to remember the parts seen before occlusion.
Three variants of the training set and the test set are produced with different
degrees of degradation. For the low-noise variant, the uniform pixel noise is cho-
sen from the interval [ 0 . 125 , 0 . 125] . The medium noise comes from the range
[ 0 . 25 , 0 . 25] , and the high noise from [ 0 . 5 , 0 . 5] . The background level is cho-
sen from [ 0 . 125 , 0 . 125] for the low-noise variant and from [ 0 . 25 , 0 . 25] for the
medium and high-noise variants. Each training set consists of 10,000 randomly se-
lected examples from the MNIST dataset.
Two examples of the low, the medium, and the high-noise sequences are shown
in Figure 9.22 next to the original digit. While the low-noise sequences are easily
readable for humans, the medium-noise sequences are challenging. The high-noise
sequences are so badly degraded that they are almost unreadable when looking at a
single image alone. Here, the need to integrate over multiple images of the sequence
becomes obvious.
The network was trained separately on the three noise variants. Training was
done as in the previous experiments, except this time 16 iterations were used, as
determined by the length of the input sequences.
9.5.2 Experimental Results
Figure 9.23 shows the reconstruction of a high-noise test digit performed by the
trained network. The activity of the entire network is shown over time. One can ob-
serve that all features contribute to the computation. The lines of the digits, visible
Search WWH ::




Custom Search