groups [148, 109]. Memory traces of an input sequence reverberate in a randomly
connected neural network, and the states of this network are mapped by a feed-
forward network to the desired outputs.
Echo State Networks. The echo state approach to analyzing and training recur-
rent neural networks was proposed by Jaeger [109]. He uses discrete-time recurrent
networks with large numbers of inhomogeneous units. The units differ in type and
time constants and are connected randomly such that the magnitude of the gains along feedback loops is smaller than one. Since the network dynamics has a contracting effect on the state, the units implement fading memories: the effect of differences in the starting state vanishes as the network runs.
The state of the recurrent network can be viewed as a dynamic reservoir of
past inputs. This reservoir is accessed by linear readout units. Only the weights of these units are trained, by minimizing a cost function, to approximate the desired outputs. This assumes that the desired input-output mapping can be realized as a
function of fading memories. Furthermore, since the random recurrent connections
of the dynamic reservoir are not trained, it is assumed that the features needed to
compute the output will be among the many random features extracted by the units
of the reservoir.
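To make this training scheme concrete, the following sketch (Python with NumPy; the dimensions, scaling factors, and toy task are illustrative assumptions rather than details from the text) drives a fixed random reservoir whose recurrent weight matrix is rescaled to a spectral radius below one and fits only a linear readout by ridge regression:

    import numpy as np

    rng = np.random.default_rng(0)
    n_in, n_res = 1, 200                       # illustrative sizes

    # Fixed random reservoir; rescaling to spectral radius < 1 yields the
    # contracting dynamics (fading memory) described above.
    W = rng.uniform(-0.5, 0.5, (n_res, n_res))
    W *= 0.9 / max(abs(np.linalg.eigvals(W)))
    W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))

    def run_reservoir(inputs):
        """Drive the untrained reservoir and collect its states."""
        x, states = np.zeros(n_res), []
        for u in inputs:
            x = np.tanh(W @ x + W_in @ u)
            states.append(x.copy())
        return np.array(states)

    def train_readout(states, targets, ridge=1e-6):
        """Fit only the linear readout weights by ridge regression."""
        return np.linalg.solve(states.T @ states + ridge * np.eye(n_res),
                               states.T @ targets)

    # Toy task: one-step-ahead prediction of a sine wave.
    u = np.sin(np.linspace(0, 20 * np.pi, 2000))[:, None]
    states = run_reservoir(u[:-1])
    W_out = train_readout(states[100:], u[101:])   # drop initial transient
    prediction = states @ W_out                    # readout output

Because the reservoir weights stay fixed, the only fitted parameters are the readout weights, which reduces training to a simple linear regression problem.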
Echo state networks have been applied to several non-trivial tasks. They include
periodic sequence generators, multistable switches, tunable frequency generators,
frequency measurement devices, controllers for nonlinear plants, long short-term
memories, dynamical pattern recognizers, and others.
For many of these tasks, feedback from output units to the reservoir was neces-
sary. Since, initially, the outputs do not resemble the desired outputs, the activity of
output units was clamped to the target values during training. For testing, the outputs
were clamped to the desired outputs during an initial phase. After this phase, the out-
puts ran free, and the test error was evaluated. When applying such a scheme, one must take care not to give the network, during the initial phase, information about the outputs desired in the free-running phase. Otherwise, the network can learn to store the desired outputs in a delay line and replay them during testing.
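A minimal sketch of this clamping scheme (again Python with NumPy; all matrices, dimensions, and the sine-wave task are illustrative assumptions) feeds the desired outputs back into the reservoir during training and during a warm-up phase at test time, after which the outputs run free:

    import numpy as np

    rng = np.random.default_rng(1)
    n_res, n_out = 200, 1                      # illustrative sizes

    # Fixed random reservoir (spectral radius < 1) plus a random feedback
    # matrix that carries the output back into the reservoir.
    W = rng.uniform(-0.5, 0.5, (n_res, n_res))
    W *= 0.9 / max(abs(np.linalg.eigvals(W)))
    W_fb = rng.uniform(-0.5, 0.5, (n_res, n_out))

    def teacher_forced_states(targets):
        """Training: the feedback is clamped to the desired outputs."""
        x, states = np.zeros(n_res), []
        for y in targets:
            states.append(x.copy())
            x = np.tanh(W @ x + W_fb @ y)
        return np.array(states)

    def free_run(W_out, warmup_targets, n_steps):
        """Testing: clamp the outputs during an initial phase, then let
        the network run free on its own predictions."""
        x = np.zeros(n_res)
        for y in warmup_targets:               # clamped initial phase
            x = np.tanh(W @ x + W_fb @ y)
        outputs = []
        for _ in range(n_steps):               # free-running phase
            y = W_out.T @ x
            outputs.append(y)
            x = np.tanh(W @ x + W_fb @ y)      # feed prediction back
        return np.array(outputs)

    # Toy pattern-generation task: continue a sine wave.
    target = np.sin(np.linspace(0, 20 * np.pi, 2000))[:, None]
    S, Y = teacher_forced_states(target)[100:], target[100:]
    W_out = np.linalg.solve(S.T @ S + 1e-6 * np.eye(n_res), S.T @ Y)
    continuation = free_run(W_out, target[:500], n_steps=500)

In line with the caveat above, the clamped warm-up segment in this toy example precedes the stretch that the network must then generate on its own.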
Liquid State Machine. A similar approach was proposed by Maass et al. [148]. It
is called liquid state machine (LSM) since a large pool of randomly connected units
with contracting dynamics acts like a liquid that reverberates past stimuli. The units
used in LSM networks are more biologically realistic: time is continuous, and the units emit spikes. Furthermore, dynamic synapses and transmission delays resemble properties of biological neurons. The cells of the liquid are chosen to be as diverse as possible and are connected randomly. Feed-forward output networks receive
inputs from all units of the liquid. Only these networks are trained to produce de-
sired outputs.
The main focus of the LSM approach is the analysis of the computational power
of such networks. It is shown that the inherent transient dynamics of the high-
dimensional dynamical system formed by a sufficiently large and heterogeneous
neural circuit may serve as a universal analog fading memory. Readout neurons can learn to extract, in real time and from the current state of such a recurrent neural circuit, information about current and past inputs that may be needed for diverse tasks.