Introduction - Hierarchical Neural Networks for Image Interpretation

classifier for isolated digits that is employed when the block classifier cannot produce a confident decision. It uses the output of the block classifier for a neighboring digit as contextual input.

Chapter 8. The second application deals with the binarization of matrix codes. After an introduction to the problem, an adaptive thresholding algorithm is proposed that produces the desired outputs for undegraded images. A hierarchical recurrent network is trained to produce these outputs even when the input images are degraded with typical noise. The binarization performance of the trained network is evaluated using a recognition system that reads the codes.
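
To make the idea of adaptive thresholding concrete, the following is a minimal sketch of a generic local-mean threshold. It is not the algorithm proposed in Chapter 8, whose details are not given here; the function name `adaptive_threshold` and the parameters `radius` and `offset` are assumptions introduced only for illustration.

```python
def adaptive_threshold(image, radius=1, offset=0.0):
    """Binarize a 2D list of gray values against the local window mean.

    Generic illustration only (hypothetical helper), not the thesis's method.
    """
    rows, cols = len(image), len(image[0])
    output = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            total, count = 0.0, 0
            # Average over a square window, clipped at the image border.
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    total += image[rr][cc]
                    count += 1
            local_mean = total / count
            # A pixel is set if it is brighter than its neighborhood.
            out_row.append(1 if image[r][c] > local_mean + offset else 0)
        output.append(out_row)
    return output


image = [
    [10, 10, 10],
    [10, 50, 10],
    [10, 10, 10],
]
binary = adaptive_threshold(image)
```

Because the threshold follows the local mean, the same rule tolerates slow brightness changes across the image, which a single global threshold would not.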

Chapter 9. The proposed architecture is applied to image reconstruction problems. Super-resolution, the filling-in of occlusions, and noise removal/contrast enhancement are learned by hierarchical recurrent networks. Images are degraded, and networks are trained to reproduce the originals iteratively. The same method is also applied to image sequences.

Chapter 10. The last application deals with a problem of human-computer interaction: face localization. A hierarchical recurrent network is trained on a database of images that show persons in office environments. The task is to indicate the eye positions by producing a blob for each eye. The network's performance is compared to that of a hybrid localization system proposed by the creators of the database.

Chapter 11. The thesis concludes with a discussion of the results and an outlook for future work.

1.3 Contributions

The thesis attempts to overcome limitations of current computer vision systems by proposing a hierarchical architecture for iterative image interpretation, investigating unsupervised and supervised learning techniques for this architecture, and applying it to several computer vision tasks.

The architecture is inspired by the ventral pathway of the human visual system. It transforms images into a sequence of representations that are increasingly abstract. As the level of abstraction rises, the spatial resolution of the representations decreases, while the feature diversity and the invariance to transformations increase.
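
The trade-off between resolution and feature diversity can be sketched with a small shape calculation. The schedule used here (halving each spatial dimension and doubling the number of features per level) is an assumption chosen for illustration, as are the function name `pyramid_shapes` and the starting sizes; the text above does not state specific factors.

```python
def pyramid_shapes(height, width, features, levels):
    """Return (height, width, feature_count) for each level of a
    hypothetical representation hierarchy."""
    shapes = []
    for _ in range(levels):
        shapes.append((height, width, features))
        # Assumed schedule: halve the resolution, double the feature count.
        height = max(1, height // 2)
        width = max(1, width // 2)
        features *= 2
    return shapes


shapes = pyramid_shapes(height=32, width=32, features=4, levels=4)
for level, (h, w, f) in enumerate(shapes):
    print(f"level {level}: {h}x{w} cells, {f} features per cell")
```

Under this assumed schedule, the number of cells shrinks by a factor of four per level while the features per cell only double, so higher levels hold fewer but more abstract values.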

Simple processing elements interact through local recurrent connections. They implement bottom-up analysis, top-down synthesis, and lateral operations, such as grouping, competition, and associative memory. Horizontal and vertical feedback loops provide context to resolve local ambiguities. In this way, the image interpretation is refined iteratively.
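
How context can resolve a local ambiguity through iteration is illustrated by the following toy sketch. It is a strong simplification under stated assumptions: a single one-dimensional layer, lateral connections to the two nearest neighbors only, no vertical feedback, and a fixed hand-chosen `lateral_weight`. The function name `refine` and all values are hypothetical and not taken from the thesis.

```python
def refine(evidence, steps=10, lateral_weight=0.5):
    """Iteratively blend each cell's own evidence with its neighbors' state.

    Toy one-dimensional model of lateral recurrent interaction.
    """
    state = list(evidence)
    n = len(state)
    for _ in range(steps):
        new_state = []
        for i in range(n):
            # Lateral context: mean of the two nearest neighbors
            # (border cells reuse their own state for the missing side).
            left = state[max(i - 1, 0)]
            right = state[min(i + 1, n - 1)]
            context = 0.5 * (left + right)
            value = (1.0 - lateral_weight) * evidence[i] + lateral_weight * context
            # Keep activities in the range [0, 1].
            new_state.append(min(1.0, max(0.0, value)))
        state = new_state
    return state


# The middle cell is locally ambiguous; its neighbors are confident.
evidence = [0.9, 0.9, 0.5, 0.9, 0.9]
result = refine(evidence)
```

After a few iterations the ambiguous middle cell is pulled toward the interpretation supported by its neighbors, while its own weak evidence keeps it below their level.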

Since the proposed architecture is a hierarchical recurrent neural network with shared weights, machine learning techniques can be applied to it. An unsupervised learning algorithm is proposed that yields a hierarchy of sparse features. It is applied to a dataset of handwritten digits. The extracted features are meaningful and facilitate digit recognition.
