Prediction of future events after learning Part 1 (The operation of memory (a single neuron can learn))

When an organism is faced with a new environment, it is compelled to adjust its behavior and adapt itself to the changed conditions. It memorizes an external event, its impressions of the event, its own behavior, and its impressions of the events that followed from this behavior. When similar circumstances arise later, the memory is recalled and makes it easier to overcome difficulties. This nontrivial process is called learning. Primitive forms of learning are a feature of all living beings, even bacteria and plants. A bacterium exhibits sensory adaptation in response to the continuous presence of a non-saturating stimulus, such as alcohol [1131], and the roots of soybean or potato form giant cells when the larvae of a nematode invade the root and reach the developing cylinder [132]. The larvae feed exclusively from the giant cells, which develop in the plant under the control of the worm. However, these are non-associative forms of learning.

Learning leads to the formation of new memory, which is usually divided into two forms: procedural (implicit, motor or unconscious) memory and declarative (explicit or aware) memory. Motor learning may be completely unconscious, but aware learning always includes an unconscious component. Declarative memory is concerned with the ability to remember or recognize objects, such as sounds, smells, visual scenes, etc. Images in declarative memory can be retained in a latent state inaccessible to awareness. Procedural memory is connected with perceptual and motor procedures that lead to a goal. The goal may be conscious, while the details of the procedure are usually unconscious. Both forms of memory are common to animals and humans, but procedural memory is easier to study in animal experiments, while declarative memory is easier to study in humans, because of their strong tendency to acquire information as conscious knowledge [86]. Therefore, procedural memory has been well investigated in animals but is poorly studied in humans. Investigation of declarative memory in animals is usually restricted to the study of long-term potentiation [20, 845], although this form of neuronal plasticity evidently is not an example of memory at all (we will return to this problem later), unless we consider any trace of a past event (a scar, for example) to be a memory.


Basic types of learning at the neuronal level

Primitive animals, such as invertebrates and cold-blooded vertebrates, have been likened to automatons: machine-like creatures whose behavior appears mechanical and stereotyped. Some forms of behavior in higher animals are also automatic, and may be simple, such as the beating of the heart and breathing, or relatively complex, such as swimming.

Animal behavior looks goal-directed and even "conscious", because it is not based on an exhaustive programming of a sequence of actions, controlled by an external supervisor that provides all probable strategies of behavior for every possible state of the environment. An animal behaves as if the supervisor were within it, with its own needs and goals. On the other hand, humans also have the capacity for gradual trial-and-error learning that operates outside awareness. In special circumstances, after training, people do not understand why they make the actions they do: "It seems to be automatic. My mind just seemed to tell me, 'just pick it up, it's the right one'" [91]. Automatic behavior in humans may be moderately complex, such as walking with the coordinated participation of many muscles, or extremely complex, such as virtuoso piano playing. Moreover, some forms of behavior cannot be acquired through conscious learning alone, without the participation of the procedural system.

The simplest forms of learning are habituation and sensitization; they cannot be reduced to fatigue and excitement, which are weakly selective phenomena. Habituation is a selective process. The neural system of any living being reacts powerfully to unexpected signals and attenuates its responsiveness to more frequent input, while its responsiveness to rarely delivered stimuli shows a marked average increase. The amplification of the response to rare stimuli requires the presence of the other, more frequent stimulation source [1267, 371]. The response to a habitual signal recovers after an interruption in stimulation, but if the stimulus is then repeated anew, the decrease is faster. Sensitization is, to some extent, the opposite process: responses to repeated painful influences are augmented [1242, 117]. Sensitization arises in response to frequent stimulation by a weak stimulus.

Associative learning is based on the pairing in time of an indifferent event and an important event (the unconditioned stimulus, US). Either a signal from the organism itself or a signal from the environment may serve as the indifferent event (the conditioned stimulus, CS+). In experiments, another indifferent event similar to the CS+, a discriminated stimulus (CS-) that is never associated with the US, is used as a control for the specificity of learning-induced changes. Sensory information affects future movements, and the movement of a sensory organ (for example, the whiskers of rats) scans the environment, thus detecting objects [952, 1385]. New information enters the brain through receptors grouped into several informational channels. The environment may exert a subtle effect on the body and indirectly supply information, or it may directly affect the state of the body through causal inputs (US). Causal events exert important actions on an organism and may be positive (food, water, narcotics, etc.) or negative (pain, inedible food, or the threat of damage and death). Correspondingly, learning induces either attractive changes towards, or deflective changes away from, some characteristic: animals avoid negative influences and seek attractive ones. There are two basic forms of associative learning: classical and instrumental (operant) conditioning. Classical conditioning is characterized by the contingent reinforcement of a specific signal (CS+), and instrumental conditioning by the contingent reinforcement of a specific type of behavior. During classical conditioning, the original response to the CS+ is modified and may acquire the features of the unconditioned response. Instrumental conditioning, as a rule, changes the frequency of appearance of some motor action; the trained action may also be an autonomic response, such as the heart rate or almost any prominent physiological parameter [841], or it may be completely conscious.
It should be noted that although we use the terms conditioned stimulus, discriminated stimulus, unconditioned stimulus, habitual stimulus, output action, etc. to describe current behavior, they do not correspond to discrete signals that an animal encounters during life. An animal meets continuous signals and generates reactions, each of which continuously turns into the next action. Introducing these events into an experiment makes investigation easier and may be considered a permissible approximation of behavior. Classical and instrumental conditioning are models of natural behavior in an artificially simplified environment. This means that continuous signals are turned into discrete ones and only one action is considered.

The next step in studying natural behavior is the consideration of complex chains of conditioned reflexes, which occur when an animal consecutively executes several, generally different, instrumental reactions in order to receive reinforcement at the end of the chain. During instrumental learning, an animal searches for the correct action by trial and error, that is, it generates by chance one of a few determined actions. One of them later turns out to be correct, while the others are erroneous.

When animals use the trial-and-error method during complex learning, erroneous movements do not arise "by chance", and they do not decrease after learning in accordance with an exponential law. Rather, they appear suddenly at definite phases of learning and afterwards suddenly disappear. In a complex (although discrete) environment, an animal tries out some simple variants of complex chains of reflexes in order to reach the reinforcement. Such erroneous movements can more exactly be considered as "probing" movements, an accidental choice from a set of preferable actions [1251]. Sometimes, mistakes during trial-and-error behavior happen because of a shortage of reliable information from the environment or because of failures in memory for information necessary to perform the task successfully. However, this is not always the case; animals sometimes make "probing" movements despite apparently being able to remember the appropriate information [1340].

In parallel with learning, neurons in various brain regions change their activity in accordance with the learning procedure. When habituation or sensitization is observed at the level of behavior, the responses of neurons to the corresponding stimuli also decrease or increase in various areas of the neural system. Therefore, traces of indifferent signals are weak and their distribution is restricted. On the other hand, neuronal activity concerned with causal, important inputs is strong and broadly distributed in many brain areas. Nevertheless, when indifferent signals become the CS+ during learning, their traces spread in the brain. Reorganization of neuronal activity has been found in relation to every component of the behavioral conditioned response: a preparatory increase in activity, the response to the CS+, the interval between the responses to the CS+ and the US, the response to the US, the aftereffect, and even a change in the level of spontaneous activity. Usually, neurons re-order preexisting activities or, less often, generate activity anew. During the acquisition of classical conditioning, neural traces remaining after the CS+ interact with the neural traces of the US. Neural analysis of this interaction is relatively straightforward because the researcher controls the onset of both the CS+ and the US. By contrast, during the acquisition of instrumental conditioning, the part of the neural system that generated the conditioned behavior has to interact with the neural traces that remain after the US. This makes neural analysis more difficult, in that the researcher cannot directly know when the internal system initiates the generation of the conditioned behavior. The appearance of a CS+ during instrumental behavior is not obligatory, but a CS+ may also be an attribute of instrumental conditioning, for instance as the signal of the right time for the instrumental reaction. This gives the investigator a modicum of control over cued operant conditioning.
The CS+ initiates the beginning of the behavioral task and gives a reference point for timing. On the other hand, when a CS+ is introduced, the instrumental reaction is the specific response to the CS+. This resembles the modified response to the CS+ during classical conditioning.

We do not know whether the neural generation of the classical and instrumental reflexes has the same mechanism. In spite of a vast behavioral literature in comparative psychology concerning these forms of learning, we know very little of how they differ at the neural level. At the beginning of learning, the animal still does not have any information as to which form of learning will be presented. An animal immediately perceives that, for example, danger appears in the environment. Only during the training process can an animal determine which stimulus is a CS+ (that is, predicts appearance of the US), whether something depends on its own actions and which action is profitable. We tried to analyze how an animal determines regularities in the experimental procedure. Decision-making is a complex multi-step process. We sharply simplify our task if we take into account only the magnitude of action in the given moment and neglect the temporal structure of behavior.

The brain exploits both fast electrical activity (10^-3 to 10^-1 s) and slow chemical signaling (10^-2 to 10^3 s). Therefore, understanding the mechanisms of higher neural functions requires investigation of both chemical and electrical events during various forms of behavior. The most widely used object for investigations of the neuronal mechanisms of behavior is the neural system of invertebrates (especially mollusks), which possess a number of large identified neurons and a relatively simple neural system. Fig. 1.4 shows a design for the experimental analysis of intracellular electrical activity during the execution of such neural functions as habituation, classical conditioning and instrumental conditioning. The role of a specific metabolic system in neural function might be revealed by means of chemical blockade or augmentation of this system. In the mollusk, classical conditioned reflexes [442, 712, 1170, 1258] and instrumental reflexes [261, 769, 881, 1270] have been described. In our own experiments, we obtained either classical conditioning, or the mollusk showed a modification of the probability of a specific neural pattern when that pattern was contingently reinforced during instrumental actions [1260].

Fig. 1.4. Schematic representation of the elaboration of a local instrumental reflex in Helix after infusion of pharmacological drugs. The central nervous system in the semi-intact Helix preparation is shown enlarged (circle). The preparation is used for intracellular recording from central neurons (with the central nervous system intact and connected to the periphery). The mollusk receives whole-body tactile stimuli, CS+ and CS-, from electrical stimulators ES1 and ES3 by means of inductance and a whole-body painful US from electrical stimulator ES2. During operant conditioning, the mollusk receives a US in those cases in which a previously chosen trained neuron does not generate an instrumental action potential. The appearance of the US did not depend on the activity of the simultaneously recorded control neuron. During classical conditioning, the unconditioned stimulus is presented during acquisition each time after the CS+, independently of the response of either neuron. LPaG and RPaG, left and right parietal ganglia. Identified neurons LPa2, LPa3, RPa2 and RPa3 are shown by the points. Intracellular activity was amplified and recorded in digital mode. Drugs were administered by means of extra- or intracellular microiontophoresis, or by means of perfusion.

In order to compare classical and instrumental conditioning, it is important to have data on the modification of responses to the CS+ during classical and cued instrumental conditioning under similar circumstances. We compared the generation of neuronal analogs of classical and instrumental conditioning of defensive reflexes in two related neurons in the defensive system of the snail Helix [1260]. Fig. 1.5 shows a representative example of neuronal activity during classical conditioning and reacquisition after extinction.

Fig. 1.5. Representative example of neuronal activity during classical conditioning and reacquisition after extinction.

The dynamics of neuronal responses to the CS+ and CS- in our experiments demonstrate that neuronal analogs of classical conditioning satisfy the basic properties known from behavioral experiments. Averaged data demonstrate that the conditioned response increased during acquisition of classical conditioning, decreased during extinction sessions and recovered rapidly during reacquisition (Fig. 1.6). This was not the case for responses to the CS-, which decreased to a steady level during acquisition, decreased slightly during extinction and hardly changed during reacquisition.

A short break in stimulation after the acquisition series led to augmentation of the responses to the CS+. Similarly, after extinction, although the USs were omitted, response to the CS+ partially recovered by itself after the break. This is an important general property of the classical conditioned reflex [947].

Fig. 1.6. Neuronal activity during acquisition, extinction and reacquisition of classical conditioning in Helix neurons LPa2, LPa3, RPa2 and RPa3. Ordinate: the number of APs (relative units) in the responses to the conditioned (closed symbols) and discriminated (open symbols) stimuli vs. trial number. Medians and the significance of the difference between responses to the CS+ and CS- are shown at the top (Mann-Whitney U test, *P < 0.05; **P < 0.01; ***P < 0.001). The training procedure consisted of acquisition (25-35 pairings of the CS+ and US), an extinction series (15-20 presentations of the isolated CS+ after a 5-10 min break), and a second acquisition stage, in which the conditioned reflex was developed again following a 20-minute break. The development of the associative connection was judged by the change in the electrical activity of defense command neurons.
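The Mann-Whitney comparison of CS+ and CS- responses named in the caption can be illustrated with a short sketch. It computes only the U statistic, with tied AP counts given average ranks; the function names and the example AP counts are hypothetical and are not the data or the analysis code behind Fig. 1.6.

```python
from __future__ import annotations

from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(cs_plus: Sequence[float],
                   cs_minus: Sequence[float]) -> tuple[float, float]:
    """Return (U for the CS+ sample, U for the CS- sample)."""
    n1, n2 = len(cs_plus), len(cs_minus)
    ranks = average_ranks(list(cs_plus) + list(cs_minus))
    rank_sum_plus = sum(ranks[:n1])
    u_plus = rank_sum_plus - n1 * (n1 + 1) / 2
    u_minus = n1 * n2 - u_plus
    return u_plus, u_minus


# Hypothetical AP counts for one block of trials (not measured values).
u_plus, u_minus = mann_whitney_u([5, 6, 7], [1, 2, 3])
```

The smaller of the two U values is the one usually compared with a critical value or converted to a P value; that step is omitted here.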

Regularities for instrumental conditioning were more complex. To ensure that the instrumental reaction occurred within the recorded neuron, the basic reinforcement schedule was tied to only a single target neuron. The difference between the responses to the tactile stimuli CS+ and CS- served as the neuronal indication of the quality of the neuronal model of instrumental conditioning. The US was delivered to the snail only if the preselected experimental (trained) neuron failed to fire an AP within 1.5-3 seconds after the CS+ (Fig. 1.7). The appearance of a US did not depend on spike generation in the control neuron or on the firing of any neuron in response to the CS-. During training, the animal learned to determine which neuronal discharge was essential for the avoidance of punishment. In this way, it was clear which neuron was responsible for the instrumental reaction.
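The two reinforcement rules used during acquisition can be written down as a small decision function. This is a minimal sketch of the rules as stated in the text, not the actual experimental control software: the function names, the string labels and the single `window_s` parameter (one value chosen from the 1.5-3 s range) are assumptions made for illustration.

```python
from typing import Sequence


def fired_within_window(spike_times_s: Sequence[float],
                        cs_onset_s: float,
                        window_s: float) -> bool:
    """True if the trained neuron fired an AP within window_s after CS onset."""
    return any(cs_onset_s <= t <= cs_onset_s + window_s for t in spike_times_s)


def us_delivered(paradigm: str, stimulus: str, trained_neuron_fired: bool) -> bool:
    """Decide whether the painful US follows a stimulus during acquisition.

    paradigm: "classical" or "instrumental" (hypothetical labels).
    stimulus: "CS+" or "CS-".
    trained_neuron_fired: whether the trained neuron fired in the window.
    The control neuron is deliberately not an argument: the US never
    depends on its activity.
    """
    if stimulus != "CS+":
        return False  # the CS- is never paired with the US
    if paradigm == "classical":
        return True  # US after every CS+, whatever the neurons do
    if paradigm == "instrumental":
        return not trained_neuron_fired  # punishment only after a failure
    raise ValueError(f"unknown paradigm: {paradigm!r}")


# Example trial: CS+ at 4.0 s, one spike at 5.0 s, assumed 1.5 s window.
fired = fired_within_window([0.2, 5.0], cs_onset_s=4.0, window_s=1.5)
punished = us_delivered("instrumental", "CS+", fired)
```

Written this way, the only difference between the paradigms is whether the US depends on the trained neuron's own output.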

The dynamics of changes in the response to the CS+ during classical conditioning and during instrumental conditioning were rather different. At the beginning of classical conditioning, the animal collected data concerning the experimental procedure for a short period of time (around 7 pairings), and afterwards the response to the CS+ gradually increased to a steady state. At the beginning of instrumental conditioning, the response of the trained neuron to the CS+ decreased, the animal received punishment and, after this, the response of the trained neuron to the CS+ recovered and took on the role of the instrumental reaction (Fig. 1.7). Responses of the control neuron to the CS+ and CS- decreased after training (Fig. 1.8). The dynamics of the responses to the CS+ and CS- in the control neuron were similar at the end of learning (Fig. 1.8). At the same time, responses to the CS+ and CS- in the trained neuron were significantly different (Fig. 1.9). The magnitude of the trained neuron's response to the CS+ was approximately the same before and after training, but its response to the CS- decreased. The difference between these responses demonstrates the specificity of instrumental conditioning with respect to input. After learning, the response of the trained neuron to the CS+ exceeded the response of the control neuron (Figs. 1.8 and 1.9), and this demonstrates the specificity of instrumental conditioning with respect to output.
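The two kinds of specificity described above can be expressed as simple differences of median AP counts at the end of training. The sketch below assumes this difference-of-medians definition and uses invented AP counts; neither the index names nor the numbers come from the experiments.

```python
from statistics import median
from typing import Sequence


def input_specificity(trained_cs_plus: Sequence[float],
                      trained_cs_minus: Sequence[float]) -> float:
    """Trained neuron: median response to CS+ minus median response to CS-."""
    return median(trained_cs_plus) - median(trained_cs_minus)


def output_specificity(trained_cs_plus: Sequence[float],
                       control_cs_plus: Sequence[float]) -> float:
    """Response to CS+: trained-neuron median minus control-neuron median."""
    return median(trained_cs_plus) - median(control_cs_plus)


# Hypothetical AP counts in the last trials of training (not measured values).
trained_plus = [4, 5, 5, 6]
trained_minus = [1, 2, 2, 3]
control_plus = [1, 1, 2, 2]

by_input = input_specificity(trained_plus, trained_minus)
by_output = output_specificity(trained_plus, control_plus)
```

Positive values of both indices correspond to the pattern reported in the text: the trained neuron responds more to the CS+ than to the CS-, and more than the control neuron does.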

Fig. 1.7. Intracellular recordings of neuronal responses during the elaboration of a local instrumental conditioned reflex with the trained neuron RPa2 (top in each frame) and the control neuron LPa3 (bottom). At the left: the number of the CS+ is indicated for each exposure. At the right: the responses to the CS-. Calibrations are in the figure. Stimuli 11 and 14 produced incorrect responses, and painful USs were presented.

Averaged data are presented in Figs. 1.8 and 1.9. The control neuron, like the trained one, exhibited a decreased response to the CS+ at the beginning of training. In the middle of learning, at approximately the same time as the response of the trained neuron failed, the response of the control neuron increased (Fig. 1.8), and only after that did it decrease.

Fig. 1.8. Change in AP number during instrumental conditioning in the control neurons of the mollusk Helix. Ordinate, AP number; abscissa, number of trials; medians and confidence intervals are shown. Trials 20-30 are indicated. During this interval of training, the control neuron's response to the CS+ passes through a local maximum. Symbols are explained in the figure.

Comparing Figs. 1.6 and 1.9, we see that in both cases the responses to the CS+ exceed those to the CS-, and at first glance these responses seem similar. A selective increase in the conditioned response compared with responses to the CS- during training is well known. Nevertheless, the animal somehow distinguishes the logical difference between classical and instrumental conditioning. It "fears" the CS+ during classical conditioning, and an increased response anticipates the emergence of the painful US. During instrumental conditioning, it learns to generate a response to the CS+ in order to prevent the appearance of the US [1270]. Therefore, the neuronal activity observed in our experiments may be considered as representative of the instrumental action of the entire mollusk.

Decision-making is not a continuous process; it consists of at least three levels (though usually it is a more complex multi-step process). First, there is the choice of the dominant demand; second, the choice of the channel of action to satisfy this demand; and third, the choice of the method of action in the chosen channel. An animal evidently acts as if it hopes to achieve a specific result when it generates an instrumental reaction. This means that past experience allows the animal to predict probable future events in cases where the properties of the environment are steady. Comparison of the cases in which the US appeared after generation of an AP in a given neuron with those in which it appeared after failure of the AP provides the essential criterion for this neuron's participation in the instrumental reaction: absence of a difference in a neuron's response to the CS+ preceding a US means that the neuron does not participate in the instrumental paradigm. An absence of the difference throughout the brain means that the paradigm is not instrumental at all.
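One possible way to make this criterion concrete is to compare how often the US followed a CS+ trial in which a given neuron fired with how often it followed a trial in which the neuron stayed silent. The sketch below is an illustrative reading of the criterion, not the analysis used in the experiments; the function name, the trial encoding and the example trials are hypothetical.

```python
from __future__ import annotations

from typing import Optional, Sequence


def us_contingency(trials: Sequence[tuple[bool, bool]]) -> Optional[float]:
    """Return P(US | no AP) - P(US | AP) for one neuron over CS+ trials.

    Each trial is (neuron_fired, us_followed). A value near 0 means the US
    did not depend on this neuron's discharge; a value near 1 means the US
    followed only failures to fire. Returns None if either kind of trial
    is missing, because the comparison is then undefined.
    """
    us_after_ap = [us for fired, us in trials if fired]
    us_after_failure = [us for fired, us in trials if not fired]
    if not us_after_ap or not us_after_failure:
        return None
    p_us_given_ap = sum(us_after_ap) / len(us_after_ap)
    p_us_given_failure = sum(us_after_failure) / len(us_after_failure)
    return p_us_given_failure - p_us_given_ap


# Hypothetical CS+ trials. Trained neuron: the US follows every failure.
trained = [(True, False), (False, True), (True, False), (False, True)]
# Control neuron: the US is unrelated to its firing.
control = [(True, True), (False, True), (True, False), (False, False)]

trained_index = us_contingency(trained)
control_index = us_contingency(control)
```

With these invented trials the index is 1.0 for the trained neuron and 0.0 for the control neuron, matching the idea that only a neuron whose discharge changes the probability of the US takes part in the instrumental reaction.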

Fig. 1.9. Change in AP number during instrumental conditioning in the responses to the CS+ and CS- in the trained neurons of the mollusk Helix. Ordinate, AP number; abscissa, number of trials; medians and confidence intervals are shown. Symbols are explained in the figure.

Throughout a learning session, the neural system consecutively acquired information as to which kind of learning was presented, whether a reaction of the neural system must be generated or inhibited and which instrumental reaction is correct. This process follows a multistep course and may occur at the single cell level. It may be possible for neurons to evaluate the significance of the difference between the appearance of simple events, such as numbers of trials ended or not ended by punishment, etc. During habituation and classical conditioning, neurons can perform such an evaluation by a selective modulation of their excitability [1258, 1259]. Nevertheless, we do not know whether each neuron evaluates the learning process separately, on the basis of its synaptic influx and transient regulation of the AP threshold, or whether this requires more neurons. As a first approximation, we may proceed from the hypothesis that one neuron is sufficient. However, one cannot discard the fact that neurons are also subjected to learning states of the whole neural system by all their chemical and electrical connections.
