Introduction - Hierarchical Neural Networks for Image Interpretation


[Figure panels: Image Sequence; World Model; Position Estimate; Sequence of Model Fits]

Fig. 1.11. Tracking of a mobile robot in a RoboCup soccer field (image adapted from [235]). The image is obtained using an omnidirectional camera. Transitions from the field (green) to the walls (white) are searched perpendicular to the model walls that have been mapped to the image. Located transitions are transformed into local world coordinates and used to adapt the model fit.

searched for. If it can be located, its coordinates are transformed into local world coordinates and used to adapt the parameters of the model. The ball and other robots can be tracked in a similar way. When such a tracking scheme is used to control a soccer-playing robot, the initial position hypothesis must be obtained using a bottom-up method. Furthermore, it must be checked constantly whether the model fits the data well enough; otherwise, the position must be initialized again. The system is able to localize the robot in real time and to provide input of sufficient quality for playing soccer.
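
The tracking loop described above can be illustrated with a minimal sketch. It is not the system from [235]: the field dimensions, the `Pose` class, the `track_step` function, the gain, and the error threshold are all hypothetical, and the robot's orientation is assumed to be known so that only the position is adapted. Each located transition is given in robot-centred coordinates together with the model wall it was searched along.

```python
from dataclasses import dataclass

# Hypothetical world model: axis-aligned walls (metres), not taken from the source.
HALF_LENGTH, HALF_WIDTH = 4.5, 3.0
WALLS = {
    "east": ("x", HALF_LENGTH), "west": ("x", -HALF_LENGTH),
    "north": ("y", HALF_WIDTH), "south": ("y", -HALF_WIDTH),
}


@dataclass
class Pose:
    x: float
    y: float


def _mean(values):
    return sum(values) / len(values) if values else 0.0


def track_step(pose, transitions, gain=0.5, max_error=0.5):
    """Adapt the position estimate to located field/wall transitions.

    transitions: list of (wall_name, (dx, dy)), points in robot-centred
    coordinates (orientation assumed known).
    Returns (pose, fit_ok); fit_ok is False when the model does not fit
    the data well enough and the position must be initialized again.
    """
    residuals = {"x": [], "y": []}
    for wall, (dx, dy) in transitions:
        axis, wall_pos = WALLS[wall]
        # Map the transition into world coordinates using the current hypothesis.
        observed = pose.x + dx if axis == "x" else pose.y + dy
        # Residual along the wall normal: where the wall is minus where we saw it.
        residuals[axis].append(wall_pos - observed)

    all_residuals = residuals["x"] + residuals["y"]
    if not all_residuals:
        return pose, False  # nothing located: the fit cannot be confirmed
    fit_error = sum(abs(r) for r in all_residuals) / len(all_residuals)
    if fit_error > max_error:
        return pose, False  # poor fit: fall back to bottom-up initialization

    # Move the hypothesis a fraction of the way towards the observations.
    new_pose = Pose(pose.x + gain * _mean(residuals["x"]),
                    pose.y + gain * _mean(residuals["y"]))
    return new_pose, True
```

Calling `track_step` repeatedly with a hypothesis close to the true position pulls the estimate towards it, while a hypothesis far from the true position is rejected so that a bottom-up method can supply a new starting point.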

While both top-down and bottom-up methods have their merits, the image interpretation problem is far from being solved. One of the most problematic issues is the segmentation/recognition dilemma. Frequently, it is not possible to segment objects from the background without recognizing them. On the other hand, many recognition methods require object segmentation prior to feature extraction and classification.

Another difficult problem is maintaining invariance to object transformations. Many recognition methods require normalization of common variances, such as position, size, and pose of an object. This requires reliable segmentation, without which the normalization parameters cannot be estimated.
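
The dependence of normalization on segmentation can be made concrete with a small sketch. It assumes, purely for illustration, that the segmentation is given as a binary mask and that position and size are estimated as the centroid and the root-mean-square radius of the foreground pixels; the function names are hypothetical, and pose normalization is left out.

```python
import math


def normalization_params(mask):
    """Estimate position (centroid) and size (RMS radius) from a binary mask."""
    pixels = [(r, c) for r, row in enumerate(mask)
              for c, value in enumerate(row) if value]
    if not pixels:
        # Without a segmentation there is nothing to estimate the parameters from.
        raise ValueError("empty segmentation: normalization parameters are undefined")
    n = len(pixels)
    cy = sum(r for r, _ in pixels) / n
    cx = sum(c for _, c in pixels) / n
    scale = math.sqrt(sum((r - cy) ** 2 + (c - cx) ** 2 for r, c in pixels) / n)
    return (cy, cx), scale, pixels


def normalize(mask):
    """Return foreground coordinates with position and size normalized."""
    (cy, cx), scale, pixels = normalization_params(mask)
    scale = scale or 1.0  # a single-pixel object has zero radius
    return sorted(((r - cy) / scale, (c - cx) / scale) for r, c in pixels)
```

Two masks that contain the same shape at different positions yield the same normalized coordinates, whereas an empty mask, standing in for a failed segmentation, leaves the parameters undefined.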

Processing segmented objects in isolation is problematic by itself. As the example of contextual effects on letter perception in Figure 1.7 demonstrated, we are able to recognize ambiguous objects by using their context. When such objects are taken out of context, recognition may not be possible at all.

1.1.4 Iterative Interpretation through Local Interactions in a Hierarchy

Since the performance of the human visual system by far exceeds that of current

computer vision systems, it may prove fruitful to follow design patterns of the hu-
