classifier for isolated digits that is employed when the block classifier cannot
produce a confident decision. It uses the output of the block classifier for a
neighboring digit as contextual input.
Chapter 8. The second application deals with the binarization of matrix codes.
After an introduction to the problem, an adaptive thresholding algorithm is proposed
that produces target outputs for undegraded images. A hierarchical recurrent
network is trained to reproduce these outputs even when the input images are degraded
with typical noise. The binarization performance of the trained network is evaluated
using a recognition system that reads the codes.
Chapter 9. The proposed architecture is applied to image reconstruction
problems. Super-resolution, the filling-in of occlusions, and noise
removal/contrast enhancement are learned by hierarchical recurrent networks.
Images are degraded, and networks are trained to reproduce the originals
iteratively. The same method is also applied to image sequences.
Chapter 10. The last application deals with a problem of human-computer
interaction: face localization. A hierarchical recurrent network is trained on a
database of images that show persons in office environments. The task is to indicate
the eye positions by producing a blob for each eye. The network's performance is
compared with that of a hybrid localization system proposed by the creators of the
database.
Chapter 11. The thesis concludes with a discussion of the results and an outlook
for future work.
1.3 Contributions
The thesis attempts to overcome limitations of current computer vision systems by
proposing a hierarchical architecture for iterative image interpretation, investigating
unsupervised and supervised learning techniques for this architecture, and applying
it to several computer vision tasks.
The architecture is inspired by the ventral pathway of the human visual system.
It transforms images into a sequence of representations that are increasingly
abstract. As the level of abstraction rises, the spatial resolution of the
representations decreases, while the feature diversity and the invariance to
transformations increase.
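The trade-off between resolution and feature diversity can be illustrated with a small sketch. It only lists the shape of each representation level and assumes, purely for illustration, that the resolution halves and the number of feature maps doubles from one level to the next; the function name, the factor of two, and the example sizes are hypothetical and are not taken from the thesis.

```python
def pyramid_shapes(height, width, base_features, levels):
    """Return one (height, width, features) tuple per level.

    Assumption for illustration only: each level halves the spatial
    resolution and doubles the number of feature maps.
    """
    shapes = []
    h, w, f = height, width, base_features
    for _ in range(levels):
        shapes.append((h, w, f))
        # Coarser grid, richer feature vector at the next level.
        h, w, f = max(1, h // 2), max(1, w // 2), f * 2
    return shapes


shapes = pyramid_shapes(32, 32, 4, 4)
for level, (h, w, f) in enumerate(shapes):
    print(f"level {level}: {h}x{w} cells, {f} feature maps")
```

With these assumed factors, the number of grid cells shrinks by four per level while the feature count only doubles, so higher levels describe the image more compactly.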
Simple processing elements interact through local recurrent connections. They
implement bottom-up analysis, top-down synthesis, and lateral operations, such as
grouping, competition, and associative memory. Horizontal and vertical feedback
loops provide context to resolve local ambiguities. In this way, the image
interpretation is refined iteratively.
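As a toy illustration of how context can resolve a local ambiguity through iteration, the following sketch updates a one-dimensional row of cells from their own evidence plus the activity of their direct neighbors. It covers lateral interaction only, not the bottom-up and top-down paths, and the update rule, the `lateral` weight, and the example values are assumptions made for this sketch rather than the network described in the thesis.

```python
def refine(evidence, iterations=10, lateral=0.5):
    """Iteratively refine activities using neighbor context.

    Hypothetical update rule: new activity = own evidence
    + lateral * mean activity of the two neighbors, clipped to [-1, 1].
    Border cells reuse their own activity in place of the missing neighbor.
    """
    act = list(evidence)
    n = len(act)
    for _ in range(iterations):
        new = []
        for i in range(n):
            left = act[i - 1] if i > 0 else act[i]
            right = act[i + 1] if i < n - 1 else act[i]
            context = (left + right) / 2.0
            total = evidence[i] + lateral * context
            new.append(max(-1.0, min(1.0, total)))
        act = new  # synchronous update of all cells
    return act


# The middle cell has no local evidence (0.0); its neighbors are confident.
evidence = [0.8, 0.7, 0.0, 0.9, 0.6]
refined = refine(evidence)
print(refined)
```

In this example the ambiguous middle cell acquires a positive activity from its confident neighbors, while the clipping keeps all activities bounded.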
Since the proposed architecture is a hierarchical recurrent neural network with
shared weights, machine learning techniques can be applied to it. An unsupervised
learning algorithm is proposed that yields a hierarchy of sparse features. It is
applied to a dataset of handwritten digits. The extracted features are meaningful and
facilitate digit recognition.