Connectionist modeling (child development)

 

Introduction

Computational models provide a useful tool for developing and testing theories of cognitive and behavioral development. Building a model involves making precise and specific assumptions about the mechanisms underlying cognitive development. By implementing and testing the model as a running computer simulation, one can verify whether these assumptions do indeed generate behavior comparable to that of developing infants and children.

In recent years, connectionist models (artificial neural networks) have emerged as a paradigm that is especially attractive for modeling developmental change. This is because neural networks learn from data, and they develop their own internal representations as a result of interactions with an environment. After learning, they can generalize their knowledge to new instances. A model’s performance during learning and its generalization ability can then be compared to that of children to explain their behavior in terms of the model’s mechanisms. The model can also be used to generate novel predictions about development that can be tested empirically.

Principles of connectionist models

Connectionist models consist of a number of simple processing units with weighted connections between them. Activation flows from unit to unit via the connections. A unit becomes active when the activation flowing into it either exceeds a threshold value or falls within a certain range. The models learn by adjusting the weights of the connections between the units. Items such as objects or words are represented in the models by patterns of activity over the units.
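The two activation rules just described can be illustrated with a minimal sketch. The function names, the weights, and the threshold and range values below are assumptions chosen for illustration, not values from any particular published model.

```python
import numpy as np

def net_input(sending_activations, weights):
    """Weighted sum of the activation arriving at one receiving unit."""
    return float(np.dot(sending_activations, weights))

def threshold_unit(net, threshold=0.5):
    """Active (1.0) only when the incoming activation exceeds the threshold."""
    return 1.0 if net > threshold else 0.0

def range_unit(net, low=0.25, high=0.6):
    """Active (1.0) only when the incoming activation falls within [low, high]."""
    return 1.0 if low <= net <= high else 0.0

# Three sending units; activations and weights are arbitrary illustrative values
sending = np.array([1.0, 0.0, 1.0])
weights = np.array([0.5, 1.0, 0.25])

net = net_input(sending, weights)            # 1.0*0.5 + 0.0*1.0 + 1.0*0.25 = 0.75
fires_above_threshold = threshold_unit(net)  # 0.75 > 0.5, so this unit is active
fires_within_range = range_unit(net)         # 0.75 lies outside [0.25, 0.6], so this one is not
```

Many models replace the hard threshold with a smooth function such as the logistic sigmoid, which makes gradient-based weight adjustment possible.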

Connectionist models are loosely inspired by the functioning of neurons in the brain in that a large number of simple, non-linear associators with complex interconnections give rise to higher-level behavior.

However, the similarity between such networks and the brain should not be overstated, as biological and artificial neurons differ in important respects. It may be more useful to see connectionist models simply as complex associative learning systems with computational properties similar to those of the brain rather than as implementations of specific neurobiological ideas. For example, many connectionist models are very simple and contain only about 100 units, but this does not imply that the part of the brain solving the corresponding task uses only 100 neurons. Instead, individual units are sometimes taken to represent pools of neurons or cell assemblies. According to this interpretation, the activation level of a unit corresponds to the proportion of neurons firing in the pool. Nevertheless, some models have aimed for a higher degree of biological plausibility, taking into account known connection pathways in the cortex, response properties of biological neurons, realistic methods for weight adjustment (Hebbian learning), and aspects of neural development.

There are four basic methods of training a neural network. In supervised learning, an external teaching signal is provided. The network gradually learns to associate a given input with this teaching signal by computing the discrepancy between its output and the teaching signal (i.e., the output error), and adjusting the connection weights so as to reduce this discrepancy. The most popular method for training a model in this way is the backpropagation algorithm (Rumelhart, Hinton, & Williams, 1986). A common architecture for supervised connectionist models is a three-layer feed-forward network (Fig. 1). It consists of an input layer that receives data from the environment, a hidden layer in which the model develops internal representations necessary to solve the learning task, and an output layer where an output is generated.
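As a rough illustration of supervised learning in a three-layer feed-forward network, the sketch below trains a small network with batch gradient descent and backpropagated error. The XOR task, the layer sizes, the learning rate, and the function name train_three_layer are assumptions made for this example; they are not taken from any of the models cited here.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_three_layer(inputs, targets, n_hidden=4, epochs=5000, lr=0.5, seed=0):
    """Train input -> hidden -> output weights by backpropagation of the output error."""
    rng = np.random.default_rng(seed)
    n_in, n_out = inputs.shape[1], targets.shape[1]
    w_ih = rng.uniform(-1, 1, (n_in + 1, n_hidden))    # extra row holds the bias weights
    w_ho = rng.uniform(-1, 1, (n_hidden + 1, n_out))
    bias = np.ones((inputs.shape[0], 1))
    x = np.hstack([inputs, bias])
    errors = []
    for _ in range(epochs):
        # Forward pass: activation flows from input to hidden to output layer
        hidden = sigmoid(x @ w_ih)
        h = np.hstack([hidden, bias])
        output = sigmoid(h @ w_ho)
        # Output error: discrepancy between output and teaching signal
        diff = output - targets
        errors.append(float(np.mean(diff ** 2)))
        # Backward pass: propagate the error and adjust weights to reduce it
        delta_out = diff * output * (1 - output)
        delta_hid = (delta_out @ w_ho[:-1].T) * hidden * (1 - hidden)
        w_ho -= lr * h.T @ delta_out
        w_ih -= lr * x.T @ delta_hid
    return w_ih, w_ho, errors

# Illustrative task (an assumption): exclusive-or, which needs a hidden layer
patterns = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
teaching_signal = np.array([[0], [1], [1], [0]], dtype=float)
w_ih, w_ho, errors = train_three_layer(patterns, teaching_signal)
print(f"error before training: {errors[0]:.3f}, after: {errors[-1]:.3f}")
```

Recording the error at every epoch gives a learning trajectory that can be compared with children's performance over developmental time, which is the sense in which such models are evaluated.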

In unsupervised learning, there is no teaching signal, and the task of the network is often to detect similarities between different inputs or to cluster the input data. In reinforcement learning, there is no direct teaching signal, but only feedback (reward) about the success of a task performed by the network. Finally, self-supervised learning is similar to supervised learning, but here the teaching signal is generated by the network itself rather than being external. All of these training methods have been used in the modeling of development, and when building a model it is important to consider which method is appropriate for the task at hand.
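To make the contrast with supervised learning concrete, the following sketch shows one simple form of unsupervised learning, winner-take-all competitive learning, in which units come to represent clusters in the input without any teaching signal. The data, the initialization scheme, and the function names are assumptions for illustration only.

```python
import numpy as np

def competitive_learning(inputs, n_units=2, epochs=20, lr=0.1):
    """Move the weight vector of the best-matching unit toward each input pattern."""
    # Assumption: start each unit on an evenly spaced input pattern
    idx = np.linspace(0, len(inputs) - 1, n_units).astype(int)
    weights = inputs[idx].astype(float)
    for _ in range(epochs):
        for pattern in inputs:
            # The unit whose weights are closest to the input wins the competition
            winner = np.argmin(np.linalg.norm(weights - pattern, axis=1))
            # Only the winner learns; no external teaching signal is involved
            weights[winner] += lr * (pattern - weights[winner])
    return weights

def assign(inputs, weights):
    """Index of the closest unit for every input pattern."""
    dists = np.linalg.norm(inputs[:, None, :] - weights[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

# Two well-separated groups of two-dimensional patterns (made-up values)
data = np.array([[0.0, 0.1], [0.1, 0.0], [0.1, 0.1],
                 [0.9, 1.0], [1.0, 0.9], [0.9, 0.9]])
prototypes = competitive_learning(data)
clusters = assign(data, prototypes)   # first three patterns share one unit, last three the other
```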

Figure 1. A three-layered neural network. Different unit activations are indicated by gray scale.

Although they are associationist, connectionist models are not tabula rasa empirical learning systems. Each model involves a number of choices that form constraints on the learning task. These constraints concern, for example, the number of processing units and the structure of their interconnections, or the learning algorithm used to train the model. The selection of initial constraints is important because it influences whether and how the model learns a certain task. Any model involves abstraction, but abstracting away an aspect of processing that is important for learning the task, or introducing constraints that do not exist in the real-world task and that might make learning easier, can compromise the validity of a model. With this in mind, the chosen constraints can inform theories of cognitive development in children.

Representational constraints, such as the features used to represent objects in category learning, or the amount of overlap between the representations of different verbs in learning inflections, can shed light on which representations are necessary for the child to solve a certain task. Architectural constraints, such as integration between orthographic and semantic components in learning to read, or the number of units in the model, which constrains its overall capacity to learn, can show which resources are necessary for learning a task. Further constraints are processing constraints, such as the specific unit activation function and the weight update rule, and environmental constraints, which can give insights into the effect of the structure of the environment on learning (e.g., what can be learned based on a set of experimental stimuli in a within-task learning model, or what is learned from experience in a structured environment such as the linguistic input to a child).

Varying the initial constraints of models has been used to explain abnormal development. For example, aspects of different types of developmental dyslexia have been successfully modeled in a connectionist model of normal reading by reducing the number of processing units (Harm & Seidenberg, 1999). Such models can therefore give new explanations for the causes of developmental disorders as well.

Specific connectionist models of development

Connectionist models have been used to model both behavioral and cognitive development. Examples of modeled behaviors are habituation (the decrease of response to a stimulus after repeated exposure) and perseveration (repetition of a behavior even after circumstances change and the behavior is no longer appropriate, such as in the A-not-B task) (Munakata, 1998).

Connectionist models of cognitive development have addressed the trajectory of development across a number of tasks. Examples include the balance scale task (Shultz, Mareschal, & Schmidt, 1994), the acquisition of the English past tense, learning the meaning of words, phonological development (Westermann & Miranda, 2004), the emergence of object-directed behaviors in infants (Mareschal, 2001), and concept and category acquisition. Models of within-task learning at a certain stage in development have also been devised, such as category learning and speech sound discriminations in 4- and 10-month-olds (Mareschal, 2001). In this way, connectionist models can explain both the mechanisms underlying cognitive change and those that are responsible for infant learning on a shorter time scale.

One recent example of within-task learning is a model of novelty preference. Infants direct more attention to unfamiliar and unexpected stimuli, and the standard interpretation of this behavior is that they are comparing an input stimulus to an internal representation of the same stimulus (Fig. 2B). A bigger discrepancy between the stimulus and the internal representation results in a longer looking time because the infant updates the latter to reduce the discrepancy. This behavior has been translated into connectionist models with so-called auto-encoder networks (Fig. 2A). These are simple three-layer feed-forward networks where the input signal and the target signal are the same (i.e., the model learns to reproduce the input at the output layer). To learn this task, the model develops internal representations in the form of activation patterns of the hidden units for all stimuli. When a novel stimulus is presented to the network, its output will differ more from the target signal than it does for familiar stimuli, resulting in a higher output error. Here, the error in the model is equated with the looking time of the infant. With this paradigm, several aspects of categorization behavior in early infancy have been successfully modeled.

Figure 2. An auto-encoder neural network model (A) that implements the analogy of the infant looking time model (B).
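The auto-encoder account of novelty preference can be sketched in a few lines. This is a minimal illustration under stated assumptions: the six-feature binary stimuli, the layer sizes, the training parameters, and the class and method names (AutoEncoder, familiarize, looking_time) are all hypothetical and are not taken from the published models.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class AutoEncoder:
    """Three-layer network whose target signal is its own input."""

    def __init__(self, n_features, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        self.w_enc = rng.uniform(-0.5, 0.5, (n_features + 1, n_hidden))  # +1 for bias
        self.w_dec = rng.uniform(-0.5, 0.5, (n_hidden + 1, n_features))

    def forward(self, stimuli):
        bias = np.ones((stimuli.shape[0], 1))
        x = np.hstack([stimuli, bias])
        hidden = sigmoid(x @ self.w_enc)       # internal representation
        h = np.hstack([hidden, bias])
        return x, hidden, h, sigmoid(h @ self.w_dec)

    def familiarize(self, stimuli, epochs=5000, lr=0.5):
        """Backpropagation with the input itself as the teaching signal."""
        for _ in range(epochs):
            x, hidden, h, output = self.forward(stimuli)
            delta_out = (output - stimuli) * output * (1 - output)
            delta_hid = (delta_out @ self.w_dec[:-1].T) * hidden * (1 - hidden)
            self.w_dec -= lr * h.T @ delta_out
            self.w_enc -= lr * x.T @ delta_hid

    def looking_time(self, stimulus):
        """Output error for one stimulus, used here as a stand-in for looking time."""
        output = self.forward(stimulus[None, :])[3]
        return float(np.mean((output - stimulus) ** 2))

# Familiarization stimuli share a feature pattern; the novel stimulus violates it
familiar = np.array([[1, 1, 0, 0, 0, 0],
                     [1, 0, 1, 0, 0, 0],
                     [0, 1, 1, 0, 0, 0],
                     [1, 1, 1, 0, 0, 0]], dtype=float)
novel = np.array([0, 0, 0, 1, 1, 1], dtype=float)

model = AutoEncoder(n_features=6, n_hidden=3)
model.familiarize(familiar)
familiar_looking = np.mean([model.looking_time(s) for s in familiar])
novel_looking = model.looking_time(novel)   # expected to exceed familiar_looking
```

Equating output error with looking time is the modeling assumption described in the text; the mean squared error used here is just one convenient way to measure that error.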

Figure 3. A cascade-correlation network.

Another class of connectionist models has been based on recent evidence that the structural experience-dependent adaptation of the cortex in the first years of life plays a major role in cognitive change during development. These models change their structure while they learn a task, by adding or deleting units and connections.

These ‘constructivist’ models have been especially successful in accounting for stage-like development in Piagetian tasks, and for regressive behavior (the unlearning of previously learned behavior typical of U-shaped learning). Furthermore, constructivist models have been useful in formalizing Piaget’s notions of assimilation, accommodation, and equilibration.

An example of a constructivist connectionist network is one built by the cascade-correlation algorithm (Fig. 3). This model starts out with a minimal architecture comprising only the input and output layers. It is trained in a supervised way, and when the task cannot be learned in the present architecture, a new hidden unit is inserted, and training continues with the new architecture. This process is repeated until the task has been learned. Each new unit is connected to all previously inserted units as well as the input units, and can thus develop into a higher-level feature detector that helps in learning the task. While the precise mechanisms of structural change in the cortex are not well understood, they are certainly different from those in constructivist neural network models. Nevertheless, the models highlight the importance of structural change and adaptation to the environment in cognitive development.
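The growth process described above can be sketched as follows. This is a deliberately simplified illustration, not the published algorithm. It assumes plain gradient descent for the output weights, a single candidate unit per growth step trained to maximize the covariance between its activation and the remaining output error, and retraining of the output weights from scratch after each insertion. The function names, the XOR task, and all parameter values are assumptions for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_outputs(features, targets, epochs=2000, lr=0.5, seed=0):
    """Train only the weights into the output layer (delta rule)."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(-0.5, 0.5, (features.shape[1], targets.shape[1]))
    for _ in range(epochs):
        out = sigmoid(features @ w)
        w -= lr * features.T @ ((out - targets) * out * (1 - out))
    return w

def train_candidate(features, residual, epochs=2000, lr=0.5, seed=0):
    """Train one candidate unit so its activation covaries with the remaining error."""
    rng = np.random.default_rng(seed)
    w = rng.uniform(-0.5, 0.5, features.shape[1])
    centred = residual - residual.mean(axis=0)
    for _ in range(epochs):
        act = sigmoid(features @ w)
        cov = (act - act.mean()) @ centred        # one covariance per output unit
        # Gradient ascent on the summed absolute covariance
        w += lr * features.T @ ((centred @ np.sign(cov)) * act * (1 - act))
    return w

def cascade_correlation(inputs, targets, max_hidden=5, tolerance=0.01):
    bias = np.ones((inputs.shape[0], 1))
    features = np.hstack([inputs, bias])          # minimal architecture: no hidden units
    hidden_units = []
    while True:
        w_out = train_outputs(features, targets)
        output = sigmoid(features @ w_out)
        error = float(np.mean((output - targets) ** 2))
        if error < tolerance or len(hidden_units) == max_hidden:
            return hidden_units, w_out, error
        # Task not learned in the present architecture: insert a new hidden unit
        w_new = train_candidate(features, output - targets, seed=len(hidden_units))
        hidden_units.append(w_new)                # its incoming weights are now frozen
        # The new unit receives the inputs and every previously inserted unit
        features = np.hstack([features, sigmoid(features @ w_new)[:, None]])

# Illustrative task (an assumption): exclusive-or cannot be learned without hidden units
patterns = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
teaching_signal = np.array([[0], [1], [1], [0]], dtype=float)
hidden_units, output_weights, error = cascade_correlation(patterns, teaching_signal)
print(f"hidden units inserted: {len(hidden_units)}, final error: {error:.3f}")
```

Because each inserted unit takes the activations of all earlier units as input, the weight vectors in hidden_units grow by one element per insertion, which is the cascaded structure the text refers to.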

Conclusions

Since the early 1980s, connectionist models of development have become an indispensable tool for furthering our understanding of the mechanisms that underlie behavioral change. They draw inspiration from the functioning of biological neurons in the brain, but they also abstract from many aspects of real neurons and introduce some new constraints. An important step in modeling, which is sometimes omitted, is to re-translate a successful model into a developmental theory. This step specifies the relevant aspects of the model that have led to the observed behavior, and whether they find their counterpart in the real world. It also makes sure that the model does not succeed only because hidden assumptions have been made that have nothing to do with the modeled phenomenon.

All connectionist models of development are located in the large gap that looms in our understanding of the connections between brain and cognitive development. They can serve to narrow this gap, and a promising direction of future models might be to incorporate more biological plausibility while still addressing relatively high-level cognitive behavior. Existing models are biologically plausible to different degrees, but more plausible models often address low-level phenomena like visual or auditory processing. The challenge for connectionist models is to address high-level tasks that involve behavior normally explained with symbolic processing, such as reasoning, inference, analogy, and decision making.

Connectionist models have been successful in addressing many developmental phenomena, but they are mainly adapted to specific isolated tasks. A challenge is therefore to build models that account for multiple interacting tasks. Such models would have to address the question of how knowledge that has developed in separate domains is subsequently integrated, and how the models can capitalize on previously acquired knowledge. Another important question concerns the role of embodiment in learning. Connectionist models take their inspiration from learning and processing in the brain, but as such they are divorced from a body. However, proactive behavior of a learner, that is, manipulating the environment to generate new information and new learning situations, probably plays an important role in cognitive development. Therefore, a future direction of connectionist research should be to incorporate these models into embodied systems that have the ability to manipulate their environment.
