GeneRec, CHL, and Other Algorithms
The CHL algorithm traces its roots to the mean field (Peterson & Anderson, 1987) or deterministic Boltzmann machine (DBM) (Hinton, 1989b) learning algorithms, which also use locally available activation variables to perform error-driven learning in recurrently connected networks. The DBM algorithm was derived originally for networks called Boltzmann machines, which have noisy units whose activation states can be described by a probability distribution known as the Boltzmann distribution (Ackley et al., 1985). In this probabilistic framework, learning amounts to reducing the distance between the two probability distributions that arise in the minus and plus phases of settling in the network.
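To make the probabilistic framing concrete, Boltzmann machine learning can be written as minimizing the Kullback-Leibler divergence between the plus-phase and minus-phase distributions over network states, which yields a weight update that depends only on locally measurable coactivation statistics. The notation below is the standard one from the Boltzmann machine literature, not an equation taken from this text:

    G = \sum_{s} P^{+}(s) \log \frac{P^{+}(s)}{P^{-}(s)}
    \qquad\Longrightarrow\qquad
    \Delta w_{ij} = \epsilon \left( \langle s_i s_j \rangle^{+} - \langle s_i s_j \rangle^{-} \right)

Here \langle s_i s_j \rangle^{\pm} are the average coactivation probabilities of units i and j in the plus and minus phases; estimating these averages in a noisy network is exactly the computational burden noted below.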
The CHL/DBM algorithm has been derived from the Boltzmann machine learning algorithm through the use of approximations or restricted cases of the probabilistic network (Hinton, 1989b; Peterson & Anderson, 1987), and derived without the use of the Boltzmann distribution by using the continuous Hopfield energy function (Movellan, 1990). However, all of these derivations require problematic assumptions or approximations, which led some to conclude that CHL was fundamentally flawed for deterministic (non-noisy) networks (Galland, 1993; Galland & Hinton, 1990). Furthermore, the use of the original (noisy) Boltzmann machine has been limited by the extreme amounts of computation it requires: many runs of many cycles are needed to obtain the averaged probability estimates used for learning.
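For deterministic units, by contrast, the CHL update uses the settled plus- and minus-phase activation values directly, with no stochastic sampling or averaging. Here is a minimal sketch in Python, assuming the four activation vectors have already been produced by the network's settling process; the function and array names are ours and purely illustrative:

    import numpy as np

    def chl_update(x_minus, y_minus, x_plus, y_plus, lrate=0.01):
        """Contrastive Hebbian learning (CHL) weight changes for one projection.

        x_*: sending-layer activations, shape (n_send,)
        y_*: receiving-layer activations, shape (n_recv,)
        Returns dW with dW[i, j] = lrate * (x_i+ y_j+ - x_i- y_j-).
        """
        return lrate * (np.outer(x_plus, y_plus) - np.outer(x_minus, y_minus))

Note that the expression is symmetric in the sending and receiving units, so the reciprocal connection receives exactly the same weight change; this property becomes important in section 5.8.1 below.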
Thus, the derivation of CHL directly from the backpropagation algorithm for completely deterministic (non-noisy) units (via GeneRec) restores some basis for optimism in its ability to learn difficult problems. Further, although the generic form of CHL/GeneRec does have some remaining performance limitations, these are largely eliminated by the use of this learning rule in the context of a kWTA inhibition function and in conjunction with the CPCA Hebbian learning rule. Most of the problems with plain CHL/GeneRec can be traced to the consequences of using purely error-driven learning in an unconstrained bidirectionally connected network (O'Reilly, 1996b, in press). We will explore some of these issues in chapter 6.
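One simple way to picture combining CHL with CPCA Hebbian learning is as a weighted mixture of the two weight changes for the same connection. The sketch below only illustrates that idea and is not the exact formulation used in the simulator: the mixing parameter khebb and the use of plus-phase activations in the CPCA term are assumptions made for the example, and the kWTA inhibition function acts on the activations themselves, so it does not appear here.

    import numpy as np

    def combined_update(x_minus, y_minus, x_plus, y_plus, w, lrate=0.01, khebb=0.01):
        """Mix error-driven (CHL) and Hebbian (CPCA) weight changes.

        dW_err[i, j]  = x_i+ y_j+ - x_i- y_j-      (CHL / GeneRec)
        dW_hebb[i, j] = y_j+ (x_i+ - w[i, j])      (CPCA on plus-phase activations)
        """
        dw_err = np.outer(x_plus, y_plus) - np.outer(x_minus, y_minus)
        dw_hebb = y_plus[np.newaxis, :] * (x_plus[:, np.newaxis] - w)
        return lrate * (khebb * dw_hebb + (1.0 - khebb) * dw_err)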
5.8 Biological Considerations for GeneRec

We have seen that GeneRec can implement error backpropagation using locally available activation variables, making it more plausible that such a learning rule could be employed by real neurons. Also, the use of activation-based signals (as opposed to error or other variables) increases plausibility, because it is relatively straightforward to map unit activation onto neural variables such as time-averaged membrane potential or spiking rate (chapter 2). However, three main features of the GeneRec algorithm could potentially be problematic from a biological perspective: 1) weight symmetry, 2) the origin of plus and minus phase activation states, and 3) the ability of these activation states to influence synaptic modification according to the learning rule.
5.8.1 Weight Symmetry in the Cortex
Recall that the mathematical derivation of GeneRec depends on symmetric weights for units to compute their sending error contribution based on what they receive back from other units. Three points address the biological plausibility of the weight symmetry requirement in GeneRec:

• As mentioned above, a symmetry-preserving learning algorithm like the CHL version of GeneRec, when combined with either soft weight bounding or small amounts of weight decay, will automatically lead to symmetric weights even if they did not start out that way (a small numerical sketch of this symmetrization appears at the end of this section). Thus, if the brain is using something like CHL, then as long as there is bidirectional connectivity, the weight values on these connections will naturally take on symmetric values. The next two points address this much weaker constraint of bidirectional connectivity.
• The biological evidence strongly suggests that the cortex is bidirectionally connected at the level of cortical areas (e.g., Felleman & Van Essen, 1991; White, 1989a). The existence of this larger-scale bidirectional connectivity suggests that the cortex may have come under some kind of evolutionary pressure to produce reciprocal bidirectional connectivity — the use of such connectivity to perform
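As promised in the first point above, here is a minimal numerical sketch of why a symmetric update combined with weight decay symmetrizes an initially asymmetric pair of reciprocal weights. The random activation values are arbitrary stand-ins for settled plus- and minus-phase activations; the only property the argument needs is that both connections of a reciprocal pair receive the identical CHL weight change.

    import numpy as np

    rng = np.random.default_rng(0)
    w_ij, w_ji = 0.8, 0.2   # reciprocal weights, deliberately asymmetric at the start
    lrate, decay = 0.1, 0.01

    for step in range(500):
        # Arbitrary stand-ins for the two units' settled plus/minus phase activations.
        y_plus = rng.uniform(0.0, 1.0, size=2)
        y_minus = rng.uniform(0.0, 1.0, size=2)
        # CHL gives the same dw to both directions of the reciprocal connection.
        dw = lrate * (y_plus[0] * y_plus[1] - y_minus[0] * y_minus[1])
        w_ij += dw - decay * w_ij
        w_ji += dw - decay * w_ji

    # The asymmetry shrinks by a factor of (1 - decay) on every step.
    print(abs(w_ij - w_ji))   # roughly 0.004, down from the initial 0.6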