[Figure 6.16 panel labels: a) Basal Ganglia (striosomes, matrisomes, SN); b) Frontal Cortex (ventromedial, dorsal, VTA)]
Figure 6.16: Possible arrangement of descending control
and ascending distribution of midbrain dopaminergic signals
in the basal ganglia and frontal cortex. a) Striosomes within
the basal ganglia may control the firing of the substantia nigra
(SN), which sends dopamine back to the entire basal ganglia.
b) Ventromedial parts of frontal cortex may control the ventral
tegmental area (VTA), which sends dopamine back to the
entire frontal cortex.
Figure 6.17: Firing of dopaminergic VTA neurons in anticipation
of reward. The top of the plot shows a histogram
of spikes, and the bottom shows spikes for individual trials
(each trial is a different row). The cue stimulus (instruction)
precedes the response trigger by a fixed amount of time, and
thus the instruction predicts the reward to be received after the
response (movement). VTA neurons fire after the instruction,
anticipating the reward. Reproduced from Schultz et al. (1993).
the action of dopamine is likely to modulate learning in
these areas, among other things. Thus, DA is considered
a neuromodulator. These midbrain areas provide a relatively
global learning signal to the brain areas (frontal
cortex and basal ganglia) relevant for planning and motor
control. As we will see, the firing properties of these
neuromodulatory neurons are consistent with those of
the temporal differences (TD) learning rule.
Although these midbrain nuclei play the role of
broadcasting a global learning signal, other more “advanced”
brain areas are required to control the firing of
this signal. As we will see, the key idea in reinforcement
learning is computing the anticipation of future
reward; that complex task is likely performed by areas
of the frontal cortex and basal ganglia that project to
and control the midbrain dopaminergic nuclei. Neural
recording studies suggest that basal ganglia neurons
represent anticipated reward (Schultz, Apicella,
Romo, & Scarnati, 1995). Studies of patients with lesions
to the ventromedial areas of the frontal cortex (and
related structures like the cingulate) suggest that these
areas are involved in predicting rewards and punishments
(Bechara, Tranel, Damasio, & Damasio, 1996).
Figure 6.16 shows a schematic of a possible relationship
between the controlling areas, the midbrain
“broadcasters,” and the areas that are affected by the
dopamine signal. In the case of the basal ganglia system
(figure 6.16a), it is fairly well established that the
areas (called striosomes) that have direct (monosynaptic)
connections to the substantia nigra constitute a distinct
subset of the basal ganglia (Gerfen, 1985; Graybiel,
Ragsdale, & Moon Edley, 1979; Wilson, 1990).
Thus, although the dopamine signal coming from the
SN affects all of the basal ganglia, this signal may be
primarily controlled by only a specialized subset of this
structure. This notion of a distinct controller system
is an essential aspect of the TD learning framework,
where it is called the adaptive critic. It is also possible
that a similar dissociation exists in the frontal
cortex, where certain ventromedial areas play the role
of adaptive critic, controlling the dopamine signals for
the entire frontal cortex.
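
As a rough illustration of the adaptive critic idea, the following is a minimal sketch and not the model developed in this text. It assumes a standard tabular TD(0) critic, a single reward per trial, and a simple representation in which the critic has one value weight per time step following cue onset and no input before the cue. The function name simulate_td and all of its parameters are hypothetical. The critic computes an error (reward plus discounted next prediction, minus the previous prediction) and uses that one scalar as the learning signal, loosely analogous to a broadcast dopamine signal.

```python
import numpy as np

def simulate_td(n_trials=500, n_steps=20, t_cue=5, t_reward=15,
                lr=0.1, gamma=1.0):
    """Return the TD error at every time step of every trial (hypothetical sketch)."""
    w = np.zeros(n_steps)                  # critic's value weight for each time step
    cue_on = np.arange(n_steps) >= t_cue   # assumption: critic has input only from cue onset
    reward = np.zeros(n_steps)
    reward[t_reward] = 1.0                 # assumption: one unit reward per trial
    deltas = np.zeros((n_trials, n_steps))
    for trial in range(n_trials):
        for t in range(1, n_steps):
            # Predictions are zero before the cue because there is no input yet.
            v_prev = w[t - 1] if cue_on[t - 1] else 0.0
            v_now = w[t] if cue_on[t] else 0.0
            # TD error on the transition from step t-1 to step t.
            delta = reward[t] + gamma * v_now - v_prev
            deltas[trial, t] = delta
            # Only states with input (after cue onset) can adjust their prediction.
            if cue_on[t - 1]:
                w[t - 1] += lr * delta
    return deltas

if __name__ == "__main__":
    deltas = simulate_td()
    print("first trial: error peaks at step", int(np.argmax(deltas[0])))
    print("last trial:  error peaks at step", int(np.argmax(deltas[-1])))
```

With these assumed settings, the error on the first trial peaks at the reward step (15). By the last trial it peaks at cue onset (step 5), and the error at the reward step has shrunk toward zero, because the reward is by then fully predicted.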
The data on the firing properties of the VTA neurons
in simple conditioning tasks are particularly compelling
(Schultz et al., 1993). Figure 6.17 shows that the VTA
neurons learn to fire after the onset of a cue stimulus
(instruction) that reliably predicts a subsequent reward
(delivered after an arm movement is made in response
to a subsequent trigger stimulus). Figure 6.18 shows
that this anticipatory firing develops over learning, with
firing initially occurring just after the reward is actually