[Figure 9.12 panels: Input, Layer 0 (28×28), Layer 1 (14×14), Layer 2 (7×7), Layer 3 (1×1), Output, Target.]
Fig. 9.12. Network architecture for filling-in occluded parts. It is an instance of the Neural
Abstraction Pyramid architecture. The activities of all feature arrays are shown after twelve
iterations along with the reconstruction target. The same network is also used for contrast
enhancement / noise reduction and for reconstruction from sequences of degraded images.
overlapping 4 × 4 forward projections of Layer 1 features. Layer 2 features have
4 × 4 forward projections as well.
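An overlapping forward projection of this kind can be pictured as a strided convolution whose kernel is larger than its stride. The following is a minimal PyTorch sketch, not the original implementation; the feature counts for Layers 0 and 1 are illustrative assumptions, since the text only fixes 8 features for Layer 2 and 16 for Layer 3.

    import torch
    import torch.nn as nn

    # 4x4 kernels moved with stride 2 overlap their neighbors and halve
    # the resolution: 28x28 -> 14x14 -> 7x7.
    fwd_0_to_1 = nn.Conv2d(4, 4, kernel_size=4, stride=2, padding=1)  # Layer 0 -> 1
    fwd_1_to_2 = nn.Conv2d(4, 8, kernel_size=4, stride=2, padding=1)  # Layer 1 -> 2

    x = torch.randn(1, 4, 28, 28)   # Layer 0 activities (assumed 4 features)
    print(fwd_0_to_1(x).shape)      # torch.Size([1, 4, 14, 14])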
The forward and backward projections between Layer 2 and Layer 3 implement full connectivity, with 7 × 7 × 8 × 16 weights in each direction. The backward projections of Layer 1 and Layer 0 are non-overlapping: for each feature, 2 × 2 different backward projections access the 1 × 1 hyper-neighborhood in the next higher layer.
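These two connection types can be sketched as follows: the Layer 2/Layer 3 projections amount to a pair of fully connected mappings (7·7·8 × 16 weights each), and a non-overlapping 2 × 2 backward projection is a transposed convolution whose kernel size equals its stride. The lower-layer feature counts remain an assumption.

    import torch
    import torch.nn as nn

    # Full connectivity in each direction: 7*7*8 x 16 weights per direction.
    fwd_2_to_3 = nn.Linear(7 * 7 * 8, 16, bias=False)
    bwd_3_to_2 = nn.Linear(16, 7 * 7 * 8, bias=False)

    # Non-overlapping backward projection: every higher-layer cell drives one
    # distinct 2x2 block below, each position with its own weights.
    bwd_2_to_1 = nn.ConvTranspose2d(8, 4, kernel_size=2, stride=2)

    z2 = torch.randn(1, 8, 7, 7)       # Layer 2 activities
    z3 = fwd_2_to_3(z2.flatten(1))     # shape (1, 16)
    print(bwd_2_to_1(z2).shape)        # torch.Size([1, 4, 14, 14])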
Lateral projections in the first three layers originate in the 3 × 3 hyper-neighborhood of a feature cell. These layers are surrounded by a one-pixel-wide border whose activities are copied from feature cells using wrap-around. In the topmost Layer 3, the lateral projections access all 16 feature cells.
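The wrap-around border corresponds to circular padding, so a lateral projection can be sketched as a circularly padded 3 × 3 convolution; in Layer 3, a single hyper-column of 16 features, it degenerates to a full 16 → 16 mapping. Again, the feature count below Layer 2 is an assumption.

    import torch.nn as nn

    # 3x3 hyper-neighborhood; the one-pixel border copied with wrap-around
    # is exactly circular padding.
    lateral_0 = nn.Conv2d(4, 4, kernel_size=3, padding=1, padding_mode="circular")

    # Layer 3 is 1x1 with 16 feature cells, so its lateral projections
    # simply connect all 16 cells to each other.
    lateral_3 = nn.Linear(16, 16, bias=False)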
While all projection units have linear transfer functions, the output units of the processing elements use the sigmoidal transfer function f_sig (β = 1; see Fig. 4.5(a) in Section 4.2.4). The network is trained for twelve time steps with BPTT and RPROP on a working set of increasing size. A low-activity prior for the hidden features ensures the development of sparse representations.
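A condensed training sketch under these assumptions: a recurrent pyramid module (a hypothetical interface, not the book's code) is unrolled for twelve steps, BPTT is obtained by backpropagating through the accumulated loss, RPROP by torch.optim.Rprop, and the low-activity prior by a small penalty on the mean hidden activity. The squared reconstruction loss and the weight lambda_sparse are illustrative choices; the growing working set is omitted for brevity.

    import torch

    opt = torch.optim.Rprop(pyramid.parameters())   # pyramid: hypothetical module

    def train_step(image, target, steps=12, lambda_sparse=1e-3):
        state = pyramid.init_state()
        loss = 0.0
        for t in range(steps):                      # unroll twelve iterations
            output, hidden, state = pyramid(image, state)
            loss = loss + ((output - target) ** 2).mean()
            # low-activity prior: push hidden features toward sparse activity
            loss = loss + lambda_sparse * hidden.abs().mean()
        opt.zero_grad()
        loss.backward()                             # backpropagation through time
        opt.step()
        return loss.item()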
9.3.3 Experimental Results
Figure 9.13 illustrates the reconstruction process for a test example after the network
was trained. One can observe that all features contribute to the computation. In the
first few time steps, a coarse approximation to the desired output is produced. The
hidden feature arrays contain representations of the image content that develop over
time. While the activity of the topmost Layer 3 decreases after a few iterations, the
representations in the other three layers approach a more interesting attractor. They
form a distributed hierarchical representation of the digit.
Figure 9.14 shows the reconstruction process for the first ten digits of the test set. One can observe that the images change mostly at occluded pixels. This demonstrates that the network recognized the occluding square. Furthermore, the change