Numerical optimization methods, e.g. gradient descent or the fixed-point algorithm
called FastICA [106], are employed to estimate W .
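To make the fixed-point approach concrete, the following minimal sketch estimates an unmixing matrix row by row from data that is assumed to be centered and whitened, using the common tanh contrast function; the deflation scheme, constants, and function names are illustrative assumptions rather than the exact formulation of [106].

import numpy as np

def fastica(X, n_components, max_iter=200, tol=1e-5):
    """Fixed-point FastICA with deflation; X is centered, whitened data of shape (dims, samples)."""
    dims, _ = X.shape
    rng = np.random.default_rng(0)
    W = np.zeros((n_components, dims))
    for i in range(n_components):
        w = rng.standard_normal(dims)
        w /= np.linalg.norm(w)
        for _ in range(max_iter):
            wx = w @ X                                   # projections of all samples
            g = np.tanh(wx)                              # tanh contrast function
            g_prime = 1.0 - g ** 2
            w_new = (X * g).mean(axis=1) - g_prime.mean() * w   # fixed-point update
            w_new -= W[:i].T @ (W[:i] @ w_new)           # deflate against earlier components
            w_new /= np.linalg.norm(w_new)
            converged = abs(abs(w_new @ w) - 1.0) < tol  # direction unchanged up to sign
            w = w_new
            if converged:
                break
        W[i] = w
    return W                                             # estimated sources: S = W @ X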
Other Unsupervised Learning Techniques. Because the goals of unsupervised
learning can vary greatly, there exist many different unsupervised learning
techniques that have not been discussed so far.
One example is slow feature analysis (SFA), recently proposed by Wiskott and
Sejnowski [244]. This method focuses on finding representations that change only
slowly as input examples undergo a transformation. SFA expands the input signal
non-linearly and applies PCA to this expanded signal and its time derivative.
The components with the lowest variance are selected as slow features. Temporal
smoothing of the network's output is also the basis of the method proposed by
Foldiak [69] for the learning of invariant features.
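To illustrate the SFA procedure sketched above, the following minimal example applies a quadratic expansion, whitens the expanded signal by PCA, and then selects the directions along which a finite-difference approximation of the time derivative has the lowest variance; this expansion and derivative approximation are common choices and not necessarily those of [244].

import numpy as np

def sfa(x, n_slow=2):
    """Minimal slow feature analysis for a multivariate time series x of shape (T, d)."""
    T, d = x.shape
    # non-linear (quadratic) expansion: the inputs plus all products x_i * x_j
    rows, cols = np.triu_indices(d)
    z = np.hstack([x, (x[:, :, None] * x[:, None, :])[:, rows, cols]])
    z -= z.mean(axis=0)
    # whiten the expanded signal (PCA of the expanded signal)
    eigval, eigvec = np.linalg.eigh(z.T @ z / T)
    keep = eigval > 1e-7                          # drop near-singular directions
    white = eigvec[:, keep] / np.sqrt(eigval[keep])
    zw = z @ white
    # PCA of the (finite-difference) time derivative of the whitened signal
    dz = np.diff(zw, axis=0)
    _, dvec = np.linalg.eigh(dz.T @ dz / (T - 1))
    slow_dirs = dvec[:, :n_slow]                  # lowest derivative variance = slowest
    return zw @ slow_dirs                         # slow feature outputs, shape (T, n_slow)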
Another example of unsupervised techniques is the learning of sparse features.
Sparse representations can be viewed as a generalization of the local representations
generated by WTA networks. While in a local representation exactly one unit is
active, in a sparse representation multiple units can be active, but the ratio of
active to inactive units is low. This increases the representational power
of the code, facilitates generalization, allows for controlled inference, increases the
capacity of associative memories, implements fault tolerance, and allows for the
simultaneous representation of multiple items by superposition of individual encod-
ings [70]. There is substantial evidence that the human visual system utilizes sparse
coding to represent properties of visual scenes [215].
A simple local unsupervised algorithm for learning such representations in a
nonlinear neural network was proposed by Foldiak [68]. It uses Hebbian forward
connections to detect non-accidental features, an adaptive threshold to keep the
activity ratio low, and anti-Hebbian decorrelating lateral connections to keep
redundancy low. It produces codes with few active units for frequent patterns, while less
probable patterns are encoded using a higher number of active units.
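The following simplified sketch illustrates the interplay of these three ingredients: Hebbian forward weights that adapt to frequent input features, anti-Hebbian lateral weights that decorrelate the units, and adaptive thresholds that hold each unit's activity ratio near a target rate. The settling loop, learning rates, and target rate are illustrative assumptions and not a faithful reproduction of [68].

import numpy as np

def sparse_coding_network(patterns, n_units=16, p=0.1, epochs=50,
                          alpha=0.1, beta=0.02, gamma=0.02, seed=0):
    """Sketch of a sparse coding network trained on binary patterns of shape (N, n_inputs)."""
    rng = np.random.default_rng(seed)
    n_inputs = patterns.shape[1]
    Q = rng.random((n_units, n_inputs))
    Q /= Q.sum(axis=1, keepdims=True)             # normalized Hebbian forward weights
    W = np.zeros((n_units, n_units))              # anti-Hebbian lateral weights
    t = np.full(n_units, 0.5)                     # adaptive thresholds

    for _ in range(epochs):
        for x in patterns:
            drive = Q @ x                         # feedforward activation
            y = (drive > t).astype(float)
            for _ in range(10):                   # settle under lateral inhibition
                y = ((drive + W @ y) > t).astype(float)
            W -= alpha * (np.outer(y, y) - p ** 2)   # decorrelate: push correlations toward p^2
            np.fill_diagonal(W, 0.0)
            W[W > 0] = 0.0                        # lateral weights remain inhibitory
            Q += beta * y[:, None] * (x - Q)      # Hebbian update for active units only
            t += gamma * (y - p)                  # keep each unit's activity ratio near p
    return Q, W, t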
Other algorithms for the learning of sparse features adjust connection weights
by explicitly maximizing measures of sparseness, successfully producing V1
simple cell-like features [170]. This class of algorithms is closely related to ICA since
sparse distributions are also non-Gaussian.
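As a rough sketch of such a sparseness-maximizing approach, the following routine alternates gradient steps on the coefficients of a linear generative model, penalized by a smooth sparseness measure, with a gradient step on the basis vectors; the particular penalty, step sizes, and names are illustrative assumptions and not the exact algorithm of [170].

import numpy as np

def sparse_coding_step(D, X, lam=0.1, n_iter=50, lr=0.1):
    """One update round of sparseness-penalized dictionary learning.

    X holds data vectors in its columns, D holds basis vectors in its columns.
    The codes A descend 0.5 * ||X - D A||^2 + lam * sum(log(1 + A**2)),
    after which the basis receives one gradient step and is renormalized.
    """
    n_atoms = D.shape[1]
    A = np.zeros((n_atoms, X.shape[1]))
    for _ in range(n_iter):                               # infer sparse codes for a fixed basis
        residual = X - D @ A
        grad_A = -D.T @ residual + lam * 2.0 * A / (1.0 + A ** 2)
        A -= lr * grad_A
    D = D + lr * (X - D @ A) @ A.T / X.shape[1]           # one gradient step on the basis
    D /= np.linalg.norm(D, axis=0, keepdims=True)         # keep basis vectors at unit length
    return D, A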
Beyond sparseness, another interesting property of a representation is the
interpretability of encodings. While a randomly chosen codeword could only signal the
presence of an item, Barlow [15] suggested that the cortex might use sparse codes
where the individual units signal the presence of meaningful features in the input.
In this scheme, items are encoded by combinations of features.
In the following section, I introduce an unsupervised learning algorithm for
the forward projections of the Neural Abstraction Pyramid. It is based on Hebbian
weight updates and lateral competition and yields a sequence of more and more
abstract representations. With increasing height, the spatial resolution of feature arrays
decreases, feature diversity increases, and the representations become increasingly
sparse.