The success of Gabor filters [79] also shows that the intermediate representations
are interesting. These filters are localized in space at u and in frequency at ξ with
Gaussian envelopes g :
g_{u,ξ}(x) = g(x − u) e^{iξx};
ĝ_{u,ξ}(ω) = ĝ(ω − ξ) e^{−iu(ω − ξ)}.
Gabor filters resemble properties of V1 simple neurons in the human visual system
and are very useful for texture discrimination [231], for example.
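The localization described above can be sketched in a few lines of NumPy. The following is an illustrative 1-D example, not the filters used in [231]: a complex Gabor filter g_{u,ξ} is a Gaussian envelope centered at u, modulated by a complex exponential of frequency ξ; the parameter values and signal below are arbitrary choices for demonstration.

```python
import numpy as np

def gabor_1d(u, xi, sigma=2.0, size=21):
    """Complex 1-D Gabor filter g_{u,xi}(x) = g(x - u) * exp(i*xi*x),
    where g is a Gaussian envelope of width sigma (illustrative sketch)."""
    x = np.arange(size, dtype=float)
    envelope = np.exp(-((x - u) ** 2) / (2.0 * sigma ** 2))
    return envelope * np.exp(1j * xi * x)

# Filter response at one location: inner product with the signal.
signal = np.cos(0.8 * np.arange(21))              # oscillation at frequency 0.8
strong = abs(np.vdot(gabor_1d(10, 0.8), signal))  # filter tuned to frequency 0.8
weak = abs(np.vdot(gabor_1d(10, 2.0), signal))    # filter tuned elsewhere
print(strong > weak)  # the frequency-matched filter responds more strongly
```

Because the filter is localized in both space (by the envelope) and frequency (by the modulation), its response is large only where the signal contains the matched frequency near u, which is what makes such filters useful for discriminating textures.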
3.1.2 Neural Networks
The hierarchical image representations discussed so far had very few, if any, param-
eters to adapt to a specific set of images. Neural networks with more free parameters
have been developed that produce representations which can be tuned to a dataset
by learning procedures. These representations need not be invertible since they
are used, for instance, for classification of an object present in the image.
Neocognitron. One classical example of such adaptable hierarchical image repre-
sentations is the Neocognitron, proposed by Fukushima [77] for digit recognition.
The architecture of this network is illustrated in Figure 3.6. It consists of several
levels, each containing multiple cell planes. The resolution of the planes decreases
from the input towards the upper levels of the hierarchy. The cell planes consist of
identical feature detectors that analyze a receptive field located in the input.
The size of the receptive fields increases with height, as do the invariance to
small translations and the complexity of the features. The cells in the first level
of the network analyze only a small input region and extract edge features. Cells
located at the second level receive input from the edge features and extract lines and
corners. Increasingly complex features, such as digit parts, are extracted at the third
level. Feature detectors at the topmost level react to the entire image and represent
digit classes.
Fig. 3.6. The Neocognitron proposed by Fukushima [77]. Digit features of increasing com-
plexity are extracted in a hierarchical feed-forward neural network.
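The architecture just described can be sketched with two ingredients: cell planes of identical feature detectors (shared weights, i.e. the same kernel applied at every receptive-field location) and a resolution reduction between levels. The following is a simplified illustration in NumPy, not Fukushima's actual S-cell/C-cell equations; the kernels and input are arbitrary stand-ins for learned edge and corner detectors.

```python
import numpy as np

def feature_plane(img, kernel):
    """Cell plane of identical detectors: the same kernel is applied
    at every receptive-field location (valid correlation, no padding)."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * kernel)
    return out

def pool2(x):
    """Halve resolution by 2x2 max pooling: tolerance to small
    translations grows while resolution decreases toward upper levels."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    x = x[:h, :w]
    return np.maximum.reduce([x[0::2, 0::2], x[0::2, 1::2],
                              x[1::2, 0::2], x[1::2, 1::2]])

# Level 1: edge-like detectors on a 16x16 input; level 2 combines them.
img = np.random.rand(16, 16)
edge_v = np.array([[1., -1.], [1., -1.]])   # vertical-edge detector
edge_h = edge_v.T                            # horizontal-edge detector
plane_v = pool2(np.maximum(feature_plane(img, edge_v), 0.0))
plane_h = pool2(np.maximum(feature_plane(img, edge_h), 0.0))
corner = np.array([[1., 1.], [1., -1.]])     # combines lower-level features
plane2 = pool2(np.maximum(feature_plane(plane_v + plane_h, corner), 0.0))
print(img.shape, plane_v.shape, plane2.shape)  # resolution shrinks with height
```

Note how the effective receptive field of a level-2 cell covers a larger input region than that of a level-1 cell, mirroring the growth of receptive-field size and feature complexity with height in the hierarchy.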