Prediction of future events after learning Part 2 (The operation of memory (a single neuron can learn))

How does a neuron reveal which type of learning task it has encountered?

At the very beginning of training, neurons demonstrated similar dynamics of cellular responses for both forms of conditioning, classical and instrumental: learning-related changes in the responses to the CS+ were absent. Subsequently, the neural system discovers the particularities of the experimental procedure and begins to generate the correct reaction. We tried to trace this process through various learning sessions (Fig. 1.10). The delay of several trials before the CS+ response starts to change may be connected with recognition of the regularities of the learning process. The data presented in Fig. 1.10 describe how information related to the experimental procedure was acquired by neurons during training. The significance of the differences, as presented in Fig. 1.10, reflects only a statistical evaluation of the information received by the neurons. We do not know what level of significance is sufficient for a neuron’s decision-making. Nevertheless, these evaluations are essential. At the beginning of training, neither the whole brain nor any of its neurons has any information regarding the form of the current training. This information does not exist unless the neuron can take the researcher’s intention into consideration. As training proceeds, this information appears bit by bit and we can observe how neurons change their behavior. This process is illustrated in Fig. 1.10.


During an experimental procedure that corresponds to classical conditioning, the CS+ precedes the US significantly more often than the CS- does. At each stage of classical conditioning, one can evaluate the statistical significance of the difference between the number of times the US was given after the CS+ and after the CS-. In this way an animal acquires knowledge about which tactile stimulus more frequently precedes the US (Fig. 1.10A, rhombi). The difference became significant only after 5-7 combinations of the tactile stimulus and the US. At this time, the appearance of the US after the CS+ still did not depend on generation or failure of an AP in response to the CS+ (Fig. 1.10A, squares), as expected for a classical procedure. The response to the CS+ began to increase only after around 7 combinations of CS+ and US (Fig. 1.6).
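The accumulation of evidence described above can be illustrated with a small sketch. The text states only that a "test for binary sequences" was used, so the exact two-sided sign test below is an assumption, as are the function names `sign_test_p` and `first_significant_trial` and the 0.05 threshold applied trial by trial.

```python
from math import comb

def sign_test_p(k: int, n: int) -> float:
    """Two-sided exact binomial p-value for k successes in n trials (p0 = 0.5)."""
    if n == 0:
        return 1.0
    tail = min(k, n - k)
    p = 2 * sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, p)

def first_significant_trial(preceded_by_cs_plus, alpha=0.05):
    """Return the 1-based count of US deliveries at which the accumulated
    CS+ versus CS- difference first reaches p < alpha, or None if it never does.

    preceded_by_cs_plus: sequence of 1 (US followed CS+) or 0 (US followed CS-).
    """
    k = 0
    for n, flag in enumerate(preceded_by_cs_plus, start=1):
        k += flag
        if sign_test_p(k, n) < alpha:
            return n
    return None

# Idealized classical procedure (an assumption): every US follows a CS+.
print(first_significant_trial([1] * 10))
```

Under this idealized assumption the threshold is first crossed at the sixth pairing (p = 2/64), which is of the same order as the 5-7 combinations reported above. The real data are noisier and were evaluated by the authors' own test.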

In our experiments, participation of a trained neuron in the instrumental reaction was predetermined by the conditions of the experiment, while the statistical significance of its participation in the instrumental reaction increased as data accumulated (Fig. 1.10B, rhombi). This significance was determined by the fulfillment of two conditions: more non-delivery than delivery of the US after AP generation in response to the CS+, and more delivery than non-delivery of the US after AP failure. After trials 7-10, the trained neuron had acquired sufficient information to conclude that it participated in the instrumental reaction. Nevertheless, it still did not form a correct reaction (Fig. 1.9). It only began to search for an effective instrumental reaction, which may consist of the generation or failure of an AP. Note that trained and control neurons initially had a kindred function.
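The two conditions named above can be checked separately on the accumulated trials. The following is a minimal sketch, assuming a one-sided exact binomial test for each condition; the names `one_sided_p` and `contingency_evidence` are hypothetical, and the way the original analysis combines the two values into a single measure is not reproduced here.

```python
from math import comb

def one_sided_p(k: int, n: int) -> float:
    """P(X >= k) for X ~ Binomial(n, 0.5); returns 1.0 when there is no data."""
    if n == 0:
        return 1.0
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n

def contingency_evidence(trials):
    """trials: list of (ap_generated, us_delivered) pairs for CS+ presentations.

    Returns two p-values:
      1. US non-delivery prevails after AP generation,
      2. US delivery prevails after AP failure.
    """
    ap_total = sum(1 for ap, _ in trials if ap)
    ap_no_us = sum(1 for ap, us in trials if ap and not us)
    fail_total = sum(1 for ap, _ in trials if not ap)
    fail_us = sum(1 for ap, us in trials if not ap and us)
    return one_sided_p(ap_no_us, ap_total), one_sided_p(fail_us, fail_total)

# Hypothetical record: 5 trials with an AP (no US) and 5 with AP failure (US given).
example = [(True, False)] * 5 + [(False, True)] * 5
print(contingency_evidence(example))
```

Keeping the two p-values separate makes it visible that each regularity alone is ambiguous: the first could arise from habituation and the second from classical conditioning, as the figure caption notes.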

Fig. 1.10. Statistical significance of information collected by neurons during classical (A) and instrumental conditioning (B, C). The significance of a difference in the number of events corresponding to the presence or absence of the properties examined was evaluated by the test for binary sequences. Significance of the data accumulated during the given number of trials (abscissa) was calculated for each neuron; mean values from the whole sample of neurons (binned every 3 trials) and confidence intervals (p < 0.05) are shown. A – squares (with circle) demonstrate the statistical significance of the conclusion that neurons generate or fail to generate an AP with different probability in response to a CS+ always followed by a US. Rhombi (with circle) designate the significance of the difference in the numbers of times the US was given after the CS+ and the CS-. B – rhombi give the significance of trained neuron participation in generation of the instrumental reaction. This was determined by the product of two probabilities, the presence or absence of a US after an AP generation and the presence or absence of a US after an AP failure. Presence of only one of these regularities is not sufficient, since absence of a US after an AP generation may be due to habituation, while presence of a US after an AP failure may indicate classical conditioning. Squares correspond to the significance of control neuron participation in generation of the instrumental reaction. This was determined by the product of two probabilities: the probability of trained neuron participation in the reaction and the probability that trained and control neurons either do or do not generate an AP in the same trial. C – rhombi as in A. Acquisition is slower than in A, since not every CS+ was followed by the US; this depended on the presence of the instrumental reaction generated by the trained neuron. Squares plot the significance of the conclusion that during the course of instrumental conditioning the control neuron generated or failed to generate an AP in response to the CS+ with different probability in trials in which a US was given. This conclusion indicates that the control neuron did not respond to an instrumental procedure as if it were classical conditioning. Fig. 1.10 was redrawn in accordance with the data [1260].

Up to trials 15-18, both the trained and the control neurons decreased their reaction to the CS+. A temporal coincidence between an output reaction and the US may be more easily interpreted by a neural system as a possible instrumental reaction than a temporal coincidence between the absence of an output reaction and the US. Absence of a reaction fails to provide information about which reaction must be generated in order to prevent the appearance of a US. In our experiments, when a CS+ failed to generate an AP in the trained neuron and caused a US delivery, it also induced APs in many other neurons, including the control neuron and other similar neurons, as well as those producing a postsynaptic potential during AP failure in the trained neuron (Fig. 1.7). Therefore, AP failure in response to the CS+ may be the first attempt of the neural system to counteract the appearance of a US after an AP generation. However, if a temporal coincidence between an AP and the US is more effective for generating the supposed instrumental reaction, then the neural system must decrease its reaction to the CS+.

Determining which neuron should be responsible for an instrumental reaction is a further problem that a neural system needs to resolve. Although our experimental procedure was directed at the trained neuron, the reactions of trained and control neurons were weakly, but very significantly, correlated (coefficient of correlation r627 = 0.26, p < 0.0001). Therefore, control and trained neurons sometimes generated similar reactions, and the control neuron acquired erroneous information suggesting that its response participated in the instrumental reaction. In order to evaluate to what extent the control neuron received information that its participation is important for the generation of the instrumental reaction, we compared the number of events in which the control and trained neurons generated similar reactions (both generated or both failed to generate an AP in response to the CS+ in the same trial) with the number of events in which the reactions of the control and trained neurons did not coincide with respect to AP generation in response to the CS+ (Fig. 1.10B, squares). At the beginning of training and up to 15 trials, it was uncertain whether the control neuron participated in the instrumental reaction (probability of participation was around 0.5). In the middle of training, when the control neuron generated an erroneous instrumental reaction, the trained neuron decreased its participation in the reaction. At this point, control and trained neurons failed to generate APs in counterphase. However, until the very end of training the control neuron did not acquire information as to whether it participated in the instrumental reaction or not. At the same time, participation of the trained neuron was highly significant (Fig. 1.10B, rhombi).
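The comparison of coinciding and non-coinciding trials can be sketched as follows. This is only an illustration under assumptions: responses are coded as 1 (AP generated) or 0 (AP failure), the correlation is computed as the phi coefficient (the Pearson correlation for two binary sequences), and the names `coincidence_counts` and `phi_coefficient` as well as the sample sequences are hypothetical.

```python
from math import sqrt

def coincidence_counts(trained, control):
    """Count trials where both neurons reacted alike versus differently."""
    same = sum(1 for t, c in zip(trained, control) if t == c)
    differ = sum(1 for t, c in zip(trained, control) if t != c)
    return same, differ

def phi_coefficient(trained, control):
    """Phi coefficient between two equally long binary (0/1) sequences."""
    pairs = list(zip(trained, control))
    n11 = sum(1 for t, c in pairs if t == 1 and c == 1)
    n10 = sum(1 for t, c in pairs if t == 1 and c == 0)
    n01 = sum(1 for t, c in pairs if t == 0 and c == 1)
    n00 = sum(1 for t, c in pairs if t == 0 and c == 0)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        return 0.0  # undefined when one sequence is constant; 0.0 by convention here
    return (n11 * n00 - n10 * n01) / denom

# Made-up responses of the two neurons to six CS+ presentations.
trained = [1, 1, 0, 0, 1, 0]
control = [1, 0, 0, 1, 1, 0]
print(coincidence_counts(trained, control), phi_coefficient(trained, control))
```

A weak positive coefficient of this kind means that the control neuron's response coincides with the trained neuron's somewhat more often than chance, which is enough to supply it with misleading evidence of participation.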

During instrumental conditioning the appearance of a CS+ was not always followed by a US, because this depended on generation or failure of an AP in the trained neuron’s response to the CS+. Therefore, the difference between the responses to the CS+ and CS- (Fig. 1.10C, rhombi) was acquired later than in the case of classical conditioning (Fig. 1.10A, rhombi). However, the control neuron preferred to respond to the CS- rather than to the CS+ even before it began to generate an erroneous instrumental reaction to the CS+ (Fig. 1.8). As pointed out earlier, during instrumental conditioning the control neuron did not receive reliable information about its participation in the instrumental reaction (Fig. 1.10B, squares). At the beginning of training, the experimental procedure may look to the control neuron like classical conditioning with partial reinforcement. The control neuron acquired this information during trials followed by the US. Fig. 1.10C (squares) demonstrates the significance of the conclusion that the control neuron showed different probabilities of generating or failing to generate an AP in response to the CS+ (in the trials followed by the US). This conclusion was not significant up to 10 trials and, therefore, the stimulus paradigm the neuron received corresponded to a classical paradigm. However, the control neuron could not elaborate classical conditioning, since up to the 10th trial the contingency between CS+ and US was still absent (Fig. 1.10C, rhombi). In the middle of training, when the control neuron generated an erroneous instrumental reaction, it received significant information that a US was present after a CS+ when the neuron generated or failed to generate an AP with different probabilities (Fig. 1.10C, squares). This situation ceased to correspond to the classical conditioning paradigm.

During classical conditioning, a CS+ always preceded the US, independently of generation or failure of an AP in the response to the CS+. Therefore, there is no difference between the number of times a US was given after AP generation in response to the CS+ and after AP failure in response to the CS+. During successive trials, the neurons accumulated data showing that the appearance of a US does not depend on their reaction to the CS+. During the first half of training, the neurons showed the same probability of generating or failing to generate an AP (Fig. 1.10A, squares). Therefore, AP generation in response to the CS+ was not the product of an instrumental reaction. At the end of classical conditioning, the neurons generated or failed to generate an AP with different probabilities (neurons mainly did generate APs). However, at this time, the nervous system was responding to the experimental procedure as if it were classical conditioning.

These observations revealed distinct differences between a classically and an instrumentally conditioned increase in spike number. Inasmuch as these differences were observed in the same identified neurons, they strongly suggest that the mechanisms that produce the conditioned changes are also different. For example, a cellular response to a painful US increases during training only in the neurons responsible for the instrumental reaction [1260]. It might be expected that the dynamics of responses to the US reveal the greatest difference between the two forms of learning. During classical conditioning [527, 1325, 508, 401] and instrumental conditioning [553, 1245], a CS+ paired with a harmful US reduced behavioral and neuronal responses to the harmful US and thus had a defensive function. An instrumental reaction, in addition, prevents the appearance of a US.

Learning facilitates the choice of profitable behavior, because the essence of learning is the prediction of the environment’s reaction. Our data demonstrate that even behavior controlled by a single (trained) neuron develops as if the future result of an instrumental action confined to that neuron were predicted by processes running within the neuron itself.
