Introduction - Hierarchical Neural Networks for Image Interpretation


Fig. 1.13. Iterative image interpretation: (a) the image is interpreted first at positions where little ambiguity exists; (b) lateral interactions reduce ambiguity; (c) top-down expansion of abstract representations biases the low-level decision.

these representations decreases, while the diversity of features and their invariance to transformations increase.

Iterative Refinement. The proposed architecture consists of simple processing elements that interact with their neighbors. These interactions implement bottom-up operations, like feature extraction, top-down operations, like feature expansion, and lateral operations, like feature grouping.

The main idea is to interpret images iteratively, as illustrated in Figure 1.13. While images frequently contain ambiguous parts, most image parts can be interpreted relatively easily in a bottom-up manner. This produces partial representations in higher layers that can be completed using lateral interactions. Top-down expansion can then bias the interpretation of the ambiguous stimuli.
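
The three kinds of interaction can be illustrated with a small toy example. The following is only a minimal sketch, not the architecture proposed in the book: it assumes a one-dimensional stimulus of even length, two layers, activities in the range [-1, 1] where 0 stands for an ambiguous position, and fixed averaging weights. The function names (`bottom_up`, `lateral`, `top_down`, `interpret`) and the `gain` parameter are hypothetical and chosen for illustration.

```python
def clip(x, lo=-1.0, hi=1.0):
    """Keep an activity inside the range [lo, hi]."""
    return max(lo, min(hi, x))


def bottom_up(low):
    """Feature extraction: average each pair of low-level activities."""
    return [(low[2 * j] + low[2 * j + 1]) / 2 for j in range(len(low) // 2)]


def lateral(high):
    """Feature grouping: average each high-level unit with its neighbors."""
    grouped = []
    for j in range(len(high)):
        neighborhood = high[max(0, j - 1):j + 2]
        grouped.append(sum(neighborhood) / len(neighborhood))
    return grouped


def top_down(stimulus, high, gain=0.5):
    """Feature expansion: bias each low-level unit by its parent unit."""
    return [clip(s + gain * high[i // 2]) for i, s in enumerate(stimulus)]


def interpret(stimulus, steps=3):
    """Iteratively refine the interpretation of an even-length 1D stimulus."""
    low = list(stimulus)
    high = []
    for _ in range(steps):
        high = lateral(bottom_up(low))   # partial, then completed, abstraction
        low = top_down(stimulus, high)   # context biases the low-level layer
    return low, high


# Position 2 is ambiguous (0.0); its surroundings are clearly positive.
low, high = interpret([1.0, 1.0, 0.0, 1.0, 1.0, 1.0])
print(low[2] > 0)
```

In this toy setting the ambiguous position receives a positive activity because its higher-level context is positive; with a negative context it would be pushed the other way.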

This iterative refinement is a flexible way to incorporate context information. When the interpretation cannot be decided locally, the decision is deferred until further evidence arrives from the context.
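
Deferring a decision can be sketched as accumulating evidence until it crosses a threshold. This is an illustrative simplification under stated assumptions, not the mechanism used in the proposed network: evidence is a single signed number, context contributions arrive one per iteration, and the names `decide`, `interpret_with_deferral`, and `threshold` are hypothetical.

```python
def decide(evidence, threshold=0.5):
    """Return +1 or -1 once the evidence is strong enough, else None."""
    if evidence >= threshold:
        return 1
    if evidence <= -threshold:
        return -1
    return None  # still ambiguous: defer the decision


def interpret_with_deferral(local_evidence, context_stream, threshold=0.5):
    """Defer the decision until enough contextual evidence has arrived.

    Returns the decision (or None if it stays undecided) and the number
    of context contributions that were consumed.
    """
    total = local_evidence
    decision = decide(total, threshold)
    steps = 0
    for contribution in context_stream:
        if decision is not None:
            break
        total += contribution
        steps += 1
        decision = decide(total, threshold)
    return decision, steps


# Weak local evidence; the context settles the decision after three steps.
print(interpret_with_deferral(0.1, [0.1, 0.2, 0.3]))
```

A position with strong local evidence is decided immediately and consumes no context, while a position that never gathers enough evidence stays undecided.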

Adaptability and Learning. While current computer vision systems usually contain adaptable components, such as trainable classifiers, most steps of the processing chain are designed manually. Depending on the application, different preprocessing steps are applied and different features are extracted. This makes it difficult to adapt a computer vision system to a new task.

Neural networks are tools that have been successfully applied to machine learning tasks. I propose to use simple processing elements to maintain the hierarchy of representations. This yields a large hierarchical neural network with local recurrent connectivity to which unsupervised and supervised learning techniques can be applied.

While the architecture is biased for image interpretation tasks, e.g. by utilizing the 2D nature and hierarchical structure of images, it is still general enough to be adapted for different tasks. In this way, manual design is replaced by learning from a set of examples.
