Introduction - Hierarchical Neural Networks for Image Interpretation

[Figure: pixel-image → vectorization → line-drawing → structural analysis → structural graph → prototype matching → feature-vector → classification → class label "9"]

Fig. 1.10. Structural digit classification (image adapted from [21]). Information irrelevant for classification is discarded in each step while the class information is preserved.

analysis. It consists of a sequence of steps that transform one image representation into another. Examples of such transformations are edge detection, feature extraction, segmentation, template matching, and classification. Through these transformations, the representations become more compact, more abstract, and more symbolic. The individual steps are relatively small, but the nature of the representation changes completely from one end of the chain, where images are represented as two-dimensional signals, to the other, where symbolic scene descriptions are used.
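
The chain of transformations described above can be illustrated with a minimal sketch. The three stages, their names (`detect_edges`, `extract_features`, `classify`), the threshold `min_step`, and the toy decision rule are all assumptions chosen for illustration; they do not come from any particular system. The point is only that each step is small, while the representation shrinks from 18 pixel values to 15 edge flags, then to 3 features, and finally to one symbol.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of intensities


def detect_edges(image: Image, min_step: int = 50) -> Image:
    """Mark pixels whose right-hand neighbour differs strongly in intensity."""
    return [
        [1 if abs(row[x + 1] - row[x]) >= min_step else 0
         for x in range(len(row) - 1)]
        for row in image
    ]


def extract_features(edges: Image) -> List[int]:
    """Summarize the edge map as the number of edge pixels per row."""
    return [sum(row) for row in edges]


def classify(features: List[int]) -> str:
    """Toy rule (an assumption): a vertical bar yields two edges in every row."""
    return "bar" if all(count >= 2 for count in features) else "background"


def analyze(image: Image) -> str:
    """Bottom-up chain: signal -> edge map -> feature vector -> symbol."""
    edges = detect_edges(image)
    features = extract_features(edges)
    return classify(features)


# A 3x6 image containing a bright vertical bar on a dark background.
image = [
    [0, 0, 255, 255, 0, 0],
    [0, 0, 255, 255, 0, 0],
    [0, 0, 255, 255, 0, 0],
]
label = analyze(image)
```

Information flows in one direction only: a later stage never revisits the decisions of an earlier one, which is what makes the approach bottom-up.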

One example of a bottom-up system for image analysis is the structural digit recognition system [21], illustrated in Figure 1.10. It transforms the pixel-image of an isolated handwritten digit into a line-drawing, using a vectorization method. This discards information about image contrast and the width of the lines. Using structural analysis, the line-drawing is transformed into an attributed structural graph that represents the digit using components like curves and loops and their spatial relations. Small components must be ignored and gaps must be closed in order to capture the essential structure of a digit. This graph is matched against a database of structural prototypes. The match selects a specialized classifier. Quantitative attributes of the graph are compiled into a feature vector that is classified by a neural network, which outputs the class label and a classification confidence. While such a system does recognize most digits, it is necessary to reject a small fraction of the digits to achieve reliable classification.
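
The rejection of uncertain digits can be sketched as a simple threshold on the classification confidence. The text does not say how the system in [21] computes its confidence or decides to reject, so the following is an assumption: the confidence is taken to be the largest network output, and the function name `classify_with_reject` and the threshold `min_confidence` are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def classify_with_reject(
    scores: Sequence[float], min_confidence: float = 0.9
) -> Tuple[Optional[int], float]:
    """Return (label, confidence); the label is None if the digit is rejected.

    Assumption: `scores` holds one network output per class, and the
    confidence is simply the largest of these outputs.
    """
    label = max(range(len(scores)), key=lambda k: scores[k])
    confidence = scores[label]
    if confidence < min_confidence:
        return None, confidence  # too uncertain: reject instead of guessing
    return label, confidence


# A clear "9": one output dominates.
clear = [0.0] * 10
clear[9] = 0.97

# An ambiguous digit: "4" and "9" receive similar outputs.
ambiguous = [0.0] * 10
ambiguous[4] = 0.5
ambiguous[9] = 0.45
```

Raising `min_confidence` rejects more digits but lowers the error rate among the accepted ones, which is the trade-off the paragraph above refers to.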

The top-down approach to image analysis works in the opposite direction. It does not start with the image, but with a database of object models. Hypotheses about the instantiation of a model are expanded to a less abstract representation by accounting, for example, for the object position and pose. The match between an expanded hypothesis and features extracted from the image is checked in order to verify or reject the hypothesis. If it is rejected, the next hypothesis is generated. This method is successful if good models of the objects potentially present in the images are available and verification can be done reliably. Furthermore, one must ensure that the correct hypothesis is among the first ones that are generated. Top-down techniques are used for image registration and for tracking of objects in image sequences. In the latter case, hypotheses can be generated from predictions that are based on the analysis results from the preceding frames.
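
The hypothesize-and-verify loop can be sketched as follows. This is a deliberately reduced illustration under stated assumptions: the "image features" are a one-dimensional sequence, the model is a short pattern, a hypothesis is only a position, and the names `expand`, `match_score`, `ordered_by_prediction`, and `verify_hypotheses` as well as the threshold `min_score` are hypothetical. The model is assumed to be non-empty.

```python
from typing import Dict, Iterable, List, Optional, Sequence


def expand(model: Sequence[int], position: int) -> Dict[int, int]:
    """Expand the abstract model into the features expected at a position."""
    return {position + offset: value for offset, value in enumerate(model)}


def match_score(expected: Dict[int, int], features: Sequence[int]) -> float:
    """Fraction of expected features that agree with the observed ones."""
    hits = sum(
        1 for index, value in expected.items()
        if 0 <= index < len(features) and features[index] == value
    )
    return hits / len(expected)


def ordered_by_prediction(predicted: int, n_positions: int) -> List[int]:
    """Generate position hypotheses, the most plausible (closest) first."""
    return sorted(range(n_positions), key=lambda p: abs(p - predicted))


def verify_hypotheses(
    model: Sequence[int],
    features: Sequence[int],
    hypotheses: Iterable[int],
    min_score: float = 0.9,
) -> Optional[int]:
    """Return the first hypothesis that passes verification, else None."""
    for position in hypotheses:
        if match_score(expand(model, position), features) >= min_score:
            return position  # hypothesis verified
    return None  # every hypothesis was rejected


model = [1, 2, 1]                     # object model: a small feature pattern
features = [0, 0, 0, 1, 2, 1, 0, 0]  # features extracted from the current frame
previous_position = 2                 # result from the preceding frame
hypotheses = ordered_by_prediction(previous_position, len(features))
found = verify_hypotheses(model, features, hypotheses)
```

Because the hypotheses are ordered by their distance to the predicted position, the correct one is reached after only three verifications here (positions 2, 1, and 3), which illustrates why a good prediction from the preceding frames matters.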

One example of top-down image analysis is the tracking system designed to localize a mobile robot on a RoboCup soccer field [235], illustrated in Figure 1.11. A model of the field walls is combined with a hypothesis about the robot position and mapped to the image obtained from an omnidirectional camera. Perpendicular to the walls, a transition between the field color (green) and the wall (white) is
