[Figure 9.12 panels: Input, Layer 0 (28×28), Layer 1 (14×14), Layer 2 (7×7), Layer 3 (1×1), Output, Target.]
Fig. 9.12. Network architecture for filling-in occluded parts. It is an instance of the Neural
Abstraction Pyramid architecture. The activities of all feature arrays are shown after twelve
iterations along with the reconstruction target. The same network is also used for contrast
enhancement / noise reduction and for reconstruction from sequences of degraded images.
overlapping 4 × 4 forward projections of Layer 1 features. Layer 2 features have
4 × 4 forward projections as well.
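An overlapping forward projection of this kind can be pictured as a strided convolution whose kernel is larger than its stride. The following is a minimal PyTorch sketch, not the original implementation; the feature counts for Layers 0 and 1 are illustrative assumptions, since the text only fixes 8 features for Layer 2 and 16 for Layer 3.

    import torch
    import torch.nn as nn

    # 4x4 kernels moved with stride 2 overlap their neighbors and halve
    # the resolution: 28x28 -> 14x14 -> 7x7.
    fwd_0_to_1 = nn.Conv2d(4, 4, kernel_size=4, stride=2, padding=1)  # Layer 0 -> 1
    fwd_1_to_2 = nn.Conv2d(4, 8, kernel_size=4, stride=2, padding=1)  # Layer 1 -> 2

    x = torch.randn(1, 4, 28, 28)   # Layer 0 activities (assumed 4 features)
    print(fwd_0_to_1(x).shape)      # torch.Size([1, 4, 14, 14])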
The forward and backward projections between Layer 2 and Layer 3 implement full connectivity, with 7 × 7 × 8 × 16 weights in each direction. The backward projections of Layer 1 and Layer 0 are non-overlapping: for each feature, 2 × 2 different backward projections access the 1 × 1 hyper-neighborhood in the next higher layer.
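These two connection types can be sketched as follows: the Layer 2/Layer 3 projections amount to a pair of fully connected mappings (7·7·8 × 16 weights each), and a non-overlapping 2 × 2 backward projection is a transposed convolution whose kernel size equals its stride. The lower-layer feature counts remain an assumption.

    import torch
    import torch.nn as nn

    # Full connectivity in each direction: 7*7*8 x 16 weights per direction.
    fwd_2_to_3 = nn.Linear(7 * 7 * 8, 16, bias=False)
    bwd_3_to_2 = nn.Linear(16, 7 * 7 * 8, bias=False)

    # Non-overlapping backward projection: every higher-layer cell drives one
    # distinct 2x2 block below, each position with its own weights.
    bwd_2_to_1 = nn.ConvTranspose2d(8, 4, kernel_size=2, stride=2)

    z2 = torch.randn(1, 8, 7, 7)       # Layer 2 activities
    z3 = fwd_2_to_3(z2.flatten(1))     # shape (1, 16)
    print(bwd_2_to_1(z2).shape)        # torch.Size([1, 4, 14, 14])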
Lateral projections in the first three layers originate in the 3 × 3 hyper-neighborhood of a feature cell. These layers are surrounded by a one-pixel-wide border whose activities are copied from feature cells using wrap-around. In the topmost Layer 3, the lateral projections access all 16 feature cells.
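The wrap-around border corresponds to circular padding, so a lateral projection can be sketched as a circularly padded 3 × 3 convolution; in Layer 3, a single hyper-column of 16 features, it degenerates to a full 16 → 16 mapping. Again, the feature count below Layer 2 is an assumption.

    import torch.nn as nn

    # 3x3 hyper-neighborhood; the one-pixel border copied with wrap-around
    # is exactly circular padding.
    lateral_0 = nn.Conv2d(4, 4, kernel_size=3, padding=1, padding_mode="circular")

    # Layer 3 is 1x1 with 16 feature cells, so its lateral projections
    # simply connect all 16 cells to each other.
    lateral_3 = nn.Linear(16, 16, bias=False)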
While all projection units have linear transfer functions, the output units of the processing elements use the sigmoidal transfer function f_sig (β = 1; see Fig. 4.5(a) in Section 4.2.4). The network is trained for twelve time steps with BPTT and RPROP on a working set of increasing size. A low-activity prior for the hidden features ensures the development of sparse representations.
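A condensed training sketch under these assumptions: a recurrent pyramid module (a hypothetical interface, not the book's code) is unrolled for twelve steps, BPTT is obtained by backpropagating through the accumulated loss, RPROP by torch.optim.Rprop, and the low-activity prior by a small penalty on the mean hidden activity. The squared reconstruction loss and the weight lambda_sparse are illustrative choices; the growing working set is omitted for brevity.

    import torch

    opt = torch.optim.Rprop(pyramid.parameters())   # pyramid: hypothetical module

    def train_step(image, target, steps=12, lambda_sparse=1e-3):
        state = pyramid.init_state()
        loss = 0.0
        for t in range(steps):                      # unroll twelve iterations
            output, hidden, state = pyramid(image, state)
            loss = loss + ((output - target) ** 2).mean()
            # low-activity prior: push hidden features toward sparse activity
            loss = loss + lambda_sparse * hidden.abs().mean()
        opt.zero_grad()
        loss.backward()                             # backpropagation through time
        opt.step()
        return loss.item()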
9.3.3 Experimental Results
Figure 9.13 illustrates the reconstruction process for a test example after the network
was trained. One can observe that all features contribute to the computation. In the
first few time steps, a coarse approximation to the desired output is produced. The
hidden feature arrays contain representations of the image content that develop over
time. While the activity of the topmost Layer 3 decreases after a few iterations, the
representations in the other three layers approach a more interesting attractor. They
form a distributed hierarchical representation of the digit.
Figure 9.14 shows the reconstruction process for the first ten digits of the test set. One can observe that the images change mostly at occluded pixels. This demonstrates that the network recognized the occluding square. Furthermore, the change