where t_k is again the target value (not to be confused with the event index t), and o_k is the actual output activation; both are implicitly functions of time (event) t.
Equation 5.2 will be zero when the outputs exactly match the targets for all events in the environment or training set, and larger values will reflect worse performance. The goal of task learning can thus be cast as that of minimizing this error measure (also known as gradient descent in error). We refer to this as error-driven learning. In this context, SSE (equation 5.2) serves as an objective function for error-driven learning, in that it specifies the objective of learning.
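As a concrete (if minimal) sketch, equation 5.2 can be computed as follows; the NumPy arrays, their shapes, and the function name here are illustrative assumptions for this example, not anything taken from the simulator itself.

```python
import numpy as np

def sse(targets, outputs):
    """Summed squared error (equation 5.2): sum over events t and output
    units k of (t_k - o_k)**2.  Both arrays are assumed to have shape
    (n_events, n_output_units)."""
    return float(np.sum((targets - outputs) ** 2))

# Zero when outputs exactly match targets; larger values mean worse performance.
targets = np.array([[1.0, 0.0], [0.0, 1.0]])
outputs = np.array([[0.8, 0.1], [0.2, 0.7]])
print(round(sse(targets, outputs), 2))  # 0.18
```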
One standard and rather direct way to minimize any
function is to first take its derivative with respect to the
free parameters. The derivative gives you the slope of
the function, or how the function changes with changes
to the free parameters. For example:
The derivative of a network's error with respect to its weights indicates how the error changes as the weights change.

Once this derivative has been computed, the network's weights can then be adjusted to minimize the network's errors. The derivative thus provides the basis for our learning rule. We will work through exactly how to take this derivative in a moment, but first we will present the result to provide a sense of what the resulting learning rule looks and acts like.

Taking the negative of the derivative of SSE with respect to the weights, we get a weight update or learning rule called the delta rule:

\Delta w_{ik} = \epsilon (t_k - o_k) s_i    (5.3)

where s_i is the input (stimulus) unit activation, and \epsilon is the learning rate as usual. This is also known as least mean squares (LMS), and it has been around for some time (Widrow & Hoff, 1960). Essentially the same equation is used in the Rescorla-Wagner rule for classical (Pavlovian) conditioning (Rescorla & Wagner, 1972).
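As a minimal sketch of the delta rule in code (the function name, array shapes, and default learning rate below are assumptions made for this example, not the book's simulator):

```python
import numpy as np

def delta_rule_update(w, s, t, o, lrate=0.1):
    """One delta-rule step (equation 5.3): dw_ik = lrate * (t_k - o_k) * s_i.

    w: (n_inputs, n_outputs) weight matrix
    s: (n_inputs,)  sending (input) unit activations
    t: (n_outputs,) target values
    o: (n_outputs,) actual output activations
    """
    dw = lrate * np.outer(s, t - o)  # each weight moves in proportion to s_i and the error (t_k - o_k)
    return w + dw
```

Note that the update is just the outer product of the sending activations with the output errors, scaled by the learning rate; this is what gives rise to the credit assignment behavior described next.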
It should make sense that this learning rule will adjust the weights to reduce the error. Basically, it says that the weights should change as a function of the local error for the individual output unit (t_k - o_k) and the activation of the sending unit s_i. Thus, those sending units that are more active when a big error is made will receive most of the blame for this error. For example, if the output unit was active and it shouldn't have been (i.e., (t_k - o_k) is negative), then the weights from those input units that were active will be decreased. On the other hand, if the output unit wasn't active when it should have been, then the weights from those input units that were active will increase. Thus, the next time around, the unit's activation should be closer to the target value, and hence the error will be reduced.

Figure 5.4: Illustration of the credit assignment process, where the activity of a unit is represented by how bright it is. a) If the output unit was not very active and it should have been more so (i.e., t_k - o_k is positive), then the weights will all increase, but in proportion to the activity of the sending units (because the most active sending units can do the most good). b) The same principle holds when the output unit was too active.
This process of adjusting weights in proportion to the sending unit activations is called credit assignment (though a more appropriate name might be blame assignment), illustrated in figure 5.4. Credit assignment is perhaps the most important computational property of error-driven learning rules (i.e., on a similar level as correlational learning for Hebbian learning rules).
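To put hypothetical numbers on this (the activations, error, and learning rate below are invented purely for illustration): if the output unit was too active, every weight from an active sender decreases, and the most active sender is blamed the most.

```python
import numpy as np

s = np.array([1.0, 0.5, 0.0])  # sending unit activations
err = -0.8                     # t_k - o_k: the output unit was too active
lrate = 0.1

dw = lrate * err * s           # delta-rule change to this output unit's weights
print(dw)                      # approximately [-0.08, -0.04, 0.0]: blame scales with sender activity
```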
One can view the representations formed by error-driven learning as the result of a multiple credit satisfaction mechanism: an integration of the synergies and conflicts of the credit assignment process on each input-output pattern over the entire training set. Thus, instead of reflecting the strongest correlations, as Hebbian learning does, the weights here reflect the strongest solutions to the task at hand (i.e., those solutions that satisfy the most input-output mappings).
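The toy simulation below is one way to see this integration over a whole training set; the patterns, network size, learning rate, and the use of simple linear output units are all assumptions of this sketch. After repeated sweeps, the delta rule settles on weights that satisfy all four input-output mappings at once.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented training set: 4 input patterns, each mapped to 2 target outputs.
inputs = np.array([[1., 0., 1.],
                   [0., 1., 1.],
                   [1., 1., 0.],
                   [0., 0., 1.]])
targets = np.array([[1., 0.],
                    [0., 1.],
                    [1., 1.],
                    [0., 0.]])

w = rng.normal(scale=0.1, size=(3, 2))   # weights from 3 inputs to 2 outputs
lrate = 0.2

for epoch in range(200):                 # repeated sweeps through the training set
    for s, t in zip(inputs, targets):
        o = s @ w                        # linear output activations (for simplicity)
        w += lrate * np.outer(s, t - o)  # delta rule (equation 5.3)

print(np.round(inputs @ w, 2))           # outputs now closely match the targets
```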