ducing two correction factors that make the algorithm more effective, while preserving the basic underlying computation it performs, to stay true to its biological and computational motivations. The first correction factor renormalizes the weights by taking into account the expected activity level over the sending layer (which is typically sparse); this resolves the dynamic range problem. The second correction factor enhances the contrast between weak and strong weights (correlations) by applying a nonlinear sigmoidal function to the weights; this resolves the selectivity problem.
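The sigmoidal contrast enhancement is only characterized qualitatively at this point (its precise form is given later in the chapter). As a hedged sketch of the qualitative behavior, one simple sigmoid with fixed points at 0, .5, and 1, using a hypothetical gain parameter gamma, is:

```python
def contrast_enhance(w, gamma=4.0):
    """Illustrative sigmoidal contrast enhancement of a weight in [0, 1].

    This exact form and the gain value are assumptions for illustration,
    not the chapter's own function. It leaves 0, .5, and 1 unchanged;
    gamma > 1 pushes weak weights (< .5) down and strong weights (> .5) up.
    """
    return w**gamma / (w**gamma + (1.0 - w)**gamma)

for w in (0.3, 0.5, 0.7):
    print(w, round(contrast_enhance(w), 3))
```

Weak correlations are thus suppressed and strong ones emphasized, which is the selectivity effect described above.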
It is important to note how these correction factors fit within our framework of mechanisms with a clear biological basis. As described in section 4.5, the CPCA algorithm captures the essential aspects of model learning subserved by the biological mechanisms of LTP and LTD. The correction factors introduced in this section represent quantitative adjustments to this CPCA algorithm (i.e., they affect only the magnitude, not the sign, of weight changes) that retain the qualitative features of the basic CPCA algorithm motivated by the biological and computational considerations. The resulting algorithm performs efficient model learning.
Note that the explorations should make the effects of these correction factors very clear. Thus, do not despair if you do not fully understand this section; it should make more sense on a rereading after you have worked through the explorations.
4.7.1 Renormalization

When the sending layer has a low expected activity level, any given sending unit is not very active on average. Thus, when we consider the conditional probability computed by the CPCA learning rule (equation 4.11), we would expect a similarly low average probability of a given sending unit x_i being active given that the receiver is active. Indeed, if there is really no correlation between the activity of the sender and the receiver, then we expect the conditional probability to be around α, where α is the expected activity level of the sending layer (typically between .10 and .25 in our simulations). This violates the notion that a probability of .5 should represent a lack of correlation, while smaller values represent negative correlation, and larger values represent positive correlation.

Renormalization simply restores the idea that a conditional probability of .5 indicates a lack of correlation, in effect renormalizing the weights to the standard 0-1 range. The best way to accomplish this renormalization is simply to increase the upper bound for weight increases in an expanded form of the CPCA weight-update equation:

Δw_ij = ε [y_j x_i − y_j w_ij]
      = ε [y_j x_i (1 − w_ij) + y_j (1 − x_i)(0 − w_ij)]   (4.17)

(you can use simple algebra to verify that the second form of the equation is equivalent to the first).

This expanded form can be understood by analogy to the membrane potential update equation from chapter 2. The y_j x_i term is like an excitatory "conductance" that drives the weights upward toward a "reversal potential" of 1, while the y_j (1 − x_i) term is an inhibitory "conductance" that drives the weights downward toward a "reversal potential" of 0. Thus, to correct for the sparse activations, we can make this "excitatory reversal potential" greater than 1, which will increase the range of the weight values produced:

Δw_ij = ε [y_j x_i (m − w_ij) + y_j (1 − x_i)(0 − w_ij)]   (4.18)

where m > 1 is the new maximum weight value for the purposes of learning. The weights are still clipped to the standard 0-1 range, resulting in a potential loss of resolution above some value of the conditional probability (this rarely happens in practice, however). Equation 4.18 preserves a linear relationship between the true underlying conditional probability and the equilibrium weight value, which is very important.² We set the correction factor m using the following equation:
m = .5/α   (4.19)

where α is again the expected activity level of the sending layer. Because the equilibrium weight under equation 4.18 is m times the conditional probability, this choice maps an uncorrelated conditional probability of α onto the neutral weight value of .5.
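The linearity of the equilibrium weight can be verified directly. Setting the expected weight change in equation 4.18 to zero for an active receiver, and writing P as shorthand for the conditional probability P(x_i = 1 | y_j = 1):

```latex
\begin{aligned}
0 &= P\,(m - w) + (1 - P)\,(0 - w) \\
  &= Pm - Pw - w + Pw \\
  &= Pm - w \\
\Rightarrow\quad w &= mP
\end{aligned}
```

So the equilibrium weight is linear in P, and an uncorrelated sender (P = α) settles at w = mα = .5 when m = .5/α.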
² Another obvious way of compensating for expected activity levels is to multiply the increase and decrease terms by appropriate complementary factors. However, doing so produces a convex relationship between the weight and the actual conditional probability, which distorts learning by making things appear more correlated than they actually are.
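The renormalized update can be sketched in code. The following is a minimal illustration (the learning rate, activity level, and trial counts are arbitrary choices, not values from the text) of equation 4.18 with m = .5/α, showing that a sender uncorrelated with the receiver settles near a weight of .5 rather than near α:

```python
import random

def cpca_update(w, x, y, m, eps):
    """One step of the renormalized CPCA weight update (equation 4.18)."""
    dw = eps * (y * x * (m - w) + y * (1 - x) * (0 - w))
    return min(1.0, max(0.0, w + dw))  # weights stay clipped to the 0-1 range

random.seed(1)
alpha = 0.2          # expected sending-layer activity level
m = 0.5 / alpha      # renormalization factor (equation 4.19)
eps = 0.02           # learning rate (arbitrary for this sketch)
w = 0.25             # arbitrary starting weight
history = []
for _ in range(50000):
    # Uncorrelated sender: active with probability alpha when the receiver fires.
    x = 1.0 if random.random() < alpha else 0.0
    w = cpca_update(w, x, y=1.0, m=m, eps=eps)
    history.append(w)

avg = sum(history[25000:]) / 25000
print(round(avg, 2))  # hovers near .5, not near alpha = .2
```

Without the m factor (i.e., m = 1), the same simulation would equilibrate near α, illustrating the dynamic range problem that renormalization corrects.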