…sampled feature sum S_l with weight −1. The transfer functions of both projections are linear.
In the following, the weight from the specific excitatory projection to the output unit of feature kl is called E_kl, and the weight from the inhibitory projection is called I_kl. Both gain factors determine how specifically a feature cell reacts to stimuli. If the excitation is large compared to the inhibition, the cell reacts unspecifically to many stimuli that partially match its excitatory projection. On the other hand, if the inhibition is large, the cell is sharply tuned to the stimuli that exactly match the specific weights of its excitatory projection.
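To make the role of the two gains concrete, the following is a minimal sketch of how the output unit of feature kl could combine its two projections; the function signature, the rectifying transfer function, and the array shapes are assumptions, not the exact formulation of the model.

```python
import numpy as np

def feature_response(window, template, feature_sum, E_kl, I_kl):
    """Illustrative response of the output unit of feature kl.

    window      -- activities of layer l-1 inside the receptive field
    template    -- specific excitatory weights w_kl^pq (same shape as window)
    feature_sum -- unspecific feature sum accessed by the inhibitory projection
    E_kl, I_kl  -- excitatory and inhibitory gain factors
    """
    excitation = np.sum(template * window)        # specific excitatory projection (linear)
    inhibition = feature_sum                      # unspecific projection, weight -1 (linear)
    net = E_kl * excitation - I_kl * inhibition   # the two gains scale the projections
    return max(net, 0.0)                          # assumed rectifying output transfer function
```

With E_kl large relative to I_kl, almost any stimulus that partially matches the template drives the unit; increasing I_kl subtracts the unspecific feature sum, so only stimuli that closely match the specific weights remain above zero.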
5.2.2 Initialization
The weights w_kl^pq of the excitatory projections are initialized unspecifically: larger positive weights are used in the center and weaker weights towards the periphery of the receptive field window. The weights have a random component and are normalized to a sum of one. This normalization of the total excitatory weight strength is maintained during learning. The excitatory weights are not allowed to become negative.
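A sketch of such an initialization for a single excitatory projection is given below; the Gaussian center weighting, the window size, and the noise level are illustrative assumptions.

```python
import numpy as np

def init_excitatory_weights(size=4, noise=0.1, rng=None):
    """Unspecific initialization of one receptive-field window: larger
    weights in the center, weaker towards the periphery, plus a small
    random component, normalized to a sum of one."""
    rng = rng or np.random.default_rng()
    c = (size - 1) / 2.0
    y, x = np.mgrid[0:size, 0:size]
    center_bias = np.exp(-((x - c) ** 2 + (y - c) ** 2) / (2.0 * (size / 4.0) ** 2))
    w = center_bias + noise * rng.random((size, size))
    w = np.clip(w, 0.0, None)   # excitatory weights must not become negative
    return w / w.sum()          # keep the total excitatory weight at one
```

The same normalization and the non-negativity constraint would have to be re-applied after every weight change to keep the total excitatory strength constant during learning.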
The excitatory gain E_kl is initialized to 2.0, while the inhibitory gain I_kl is initialized to zero. Hence, initially the excitatory features react very unspecifically to all stimuli present on the lower layer.
The bias weights of all projection units and output units are set to zero and not
changed during learning.
5.2.3 Hebbian Weight Update
A combination of winner-takes-all learning and Hebbian weight updates [91] is used
to make the excitatory weights specific. The idea is to change the weight template of
the locally most active feature cell such that it becomes more specific to the current
input. This means it will react more strongly to the same stimulus and react less
strongly to other stimuli.
For each training step, an image is chosen randomly from the dataset. It is loaded
into the input feature array at the bottom layer of the pyramid, and the activities of all feature cells are computed in the appropriate order.
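As a rough sketch, one such training step could be organized as follows; the pyramid object, its layers, and the method names are hypothetical placeholders.

```python
import numpy as np

def training_step(images, pyramid, rng=None):
    """One training step: pick a random image, load it into the bottom
    layer, and compute all feature-cell activities bottom-up."""
    rng = rng or np.random.default_rng()
    image = images[rng.integers(len(images))]   # random training image
    pyramid.layers[0].set_input(image)          # input feature array of the bottom layer
    for layer in pyramid.layers[1:]:
        layer.compute_activities()              # evaluate feature cells in the appropriate order
    # the Hebbian updates described below are then applied
```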
The following learning rules are applied only at positions where the subsampled feature sum S_(l−1) of the inputs and the smoothed sum of the outputs S_l are nonzero. This avoids learning when there is nothing to learn.
Furthermore, the subsampled input sum S_(l−1) must have at most two larger values in its 8-neighborhood. This focuses the features to respond mostly to local maxima and ridges of the input sum.
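Both conditions are simple local tests; the sketch below checks them at a single position, where the array names and the border handling are assumptions.

```python
import numpy as np

def position_qualifies(S_in, S_out, i, j):
    """Check whether learning should take place at position (i, j).

    S_in  -- subsampled feature sum of the inputs, S_(l-1)
    S_out -- smoothed sum of the outputs, S_l
    """
    if S_in[i, j] == 0 or S_out[i, j] == 0:
        return False                              # nothing present, nothing to learn
    # at most two of the eight neighbors may exceed the central value,
    # so learning concentrates on local maxima and ridges of the input sum
    patch = S_in[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2]
    return int(np.sum(patch > S_in[i, j])) <= 2
```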
For the hypercolumns (i, j) of layer l meeting the above criteria, the most active feature k_max and the feature k_sec with the second highest activity are determined.
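A small sketch of this winner and runner-up selection, assuming the layer's feature-cell outputs are stored in a (feature, row, column) array:

```python
import numpy as np

def winner_and_runner_up(activities, i, j):
    """Return (k_max, k_sec) for hypercolumn (i, j) of layer l.

    activities -- feature-cell outputs of shape (num_features, height, width)
    """
    column = activities[:, i, j]
    order = np.argsort(column)       # feature indices sorted by ascending activity
    return order[-1], order[-2]      # most active and second most active feature
```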
The q-th weight w_(k_max l)^pq of the excitatory projection p of the winning feature k_max is changed as follows: