Biomedical Engineering Reference
In-Depth Information
In the scheme there are many interacting processes, such as trajectory formation
and feedback error learning. The latter, in particular, clearly follows
a supervised learning paradigm, and thus we may think that its main element (the
trainable feedforward model) is implemented in the cerebellar circuitry. The learning
signal, in this case, is the discrepancy between the desired trajectory (the motor
intention) and the actual trajectory, determined by the combined body-environment
dynamics and measured by different proprioceptive channels. In a sense, the brain
acts as its own supervisor, setting its detailed goal and measuring the corresponding
performance: for this reason it is possible to speak of a self-supervised paradigm.
The underlying behavioural strategy is an active exploration of the space of movements,
also known as babbling, in which the brain attempts to carry out randomly
selected movements that serve as their own teachers.
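
As an illustration of the feedback error learning scheme described above, the following is a minimal sketch, not the model of the text. It assumes a hypothetical one-dimensional plant (a mass with viscous damping), a fixed proportional-derivative feedback controller, and a feedforward model that is linear in the desired acceleration and velocity. All names, gains and parameter values are assumptions chosen for the example. Random sinusoidal movements play the role of babbling, and the feedback command is used as the training signal for the feedforward weights.

```python
import numpy as np

def feedback_error_learning(n_trials=60, steps=2000, dt=0.002, lr=0.05, seed=0):
    """Toy feedback error learning on an assumed 1-D mass-damper plant."""
    rng = np.random.default_rng(seed)
    mass, damping = 2.0, 1.0      # true plant parameters, unknown to the learner
    kp, kd = 25.0, 10.0           # fixed feedback gains (assumed values)
    w = np.zeros(2)               # feedforward weights on [a_des, v_des]
    rms_errors = []
    for _ in range(n_trials):
        # "Babbling": pick a random sinusoidal movement as the desired trajectory.
        amp = rng.uniform(0.3, 1.0)
        omega = 2.0 * np.pi * rng.uniform(0.2, 0.6)
        x, v = 0.0, amp * omega   # start on the desired trajectory
        sq_err = 0.0
        for k in range(steps):
            t = k * dt
            x_d = amp * np.sin(omega * t)
            v_d = amp * omega * np.cos(omega * t)
            a_d = -amp * omega ** 2 * np.sin(omega * t)
            err = x_d - x
            sq_err += err ** 2
            phi = np.array([a_d, v_d])            # features of the desired motion
            u_ff = w @ phi                        # feedforward (learned) command
            u_fb = kp * err + kd * (v_d - v)      # feedback command
            # The feedback command acts as the error signal for the weights.
            w += lr * u_fb * phi * dt
            # Plant dynamics: mass * acc + damping * v = u_ff + u_fb
            acc = (u_ff + u_fb - damping * v) / mass
            v += acc * dt
            x += v * dt
        rms_errors.append(float(np.sqrt(sq_err / steps)))
    return w, rms_errors
```

In this sketch the learned weights should drift toward the plant's mass and damping, so that the feedforward command gradually takes over from the feedback command and the tracking error shrinks across trials.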
On the other hand, the trajectory formation model cannot be analysed in the same
manner. It requires different maps for representing task-relevant variables, such as
the position of the objects/obstacles in the environment, the position of the body
with respect to the environment, and the relative position of the body parts. Most
of these variables are not directly detectable by means of specific sensory channels
but require a complex process of sensory fusion and dimensionality reduction. This
kind of processing is characteristic of associative cortical areas, such as the posterior
parietal cortex, which is thought to hold maps of the body schema and the external
world [54] as a result of the converging information from different sensory channels.
The process of cortical map formation can be modelled by competitive Hebbian
learning applied both to the thalamo-cortical and cortico-cortical connections:
the former connections determine the receptive fields of the cortical units, whereas
the latter support the formation of a kind of high-dimensional grid that matches the
dimensionality of the represented sensorimotor manifold. In a cortical map model,
sensorimotor variables are represented by means of population codes which change
over time as a result of the map dynamics. For example, a trajectory formation process
can be implemented by means of a cortical map representation of the external
space that can generate a time-varying population code corresponding to the desired
hand trajectory. Another map can transform the desired hand trajectory into the
corresponding desired joint trajectory, thus implementing a transformation of coordinates
from the hand space to the joint space. This kind of distributed architecture
is necessary for integrating multisensory redundant information into a task-relevant,
lower-dimensional representation of sensorimotor spaces. On top of this computational
layer, which operates in a continuous way, there is a layer of reinforcement
learning that operates mostly by trial and error through two main modules:
an actor that selects a sequence of actions and a critic that evaluates the reward
and influences the action selection of the next trial. The global coherence of such
multiple internal processes of adaptation, learning and control is guaranteed by an
effective mechanical interface with the environment which allows a bi-directional
flow of energy and information.
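
The cortical map formation and population coding described above can be sketched in a few lines. This is one possible reading, under stated assumptions, and not the specific model of the text: receptive-field centres stand in for the thalamo-cortical weights and are trained by a winner-take-all update, while a boolean link matrix stands in for the cortico-cortical connections and is built by linking the two units closest to each input (a competitive Hebbian rule). The Gaussian tuning width, the learning rate and all function names are hypothetical choices for the example.

```python
import numpy as np

def train_map(samples, n_units=50, epochs=10, lr=0.1, seed=0):
    """Learn receptive-field centres and lateral links from input samples."""
    rng = np.random.default_rng(seed)
    # Initialise centres on randomly chosen samples.
    idx = rng.choice(len(samples), size=n_units, replace=False)
    centres = samples[idx].copy()
    links = np.zeros((n_units, n_units), dtype=bool)
    for _ in range(epochs):
        for x in samples[rng.permutation(len(samples))]:
            dist = np.linalg.norm(centres - x, axis=1)
            first, second = np.argsort(dist)[:2]
            # Competitive update: only the winner moves toward the input.
            centres[first] += lr * (x - centres[first])
            # Competitive Hebbian rule: link the two best-matching units.
            links[first, second] = True
            links[second, first] = True
    return centres, links

def population_code(centres, x, width=0.2):
    """Normalised Gaussian activity of every unit for input x."""
    d2 = np.sum((centres - x) ** 2, axis=1)
    activity = np.exp(-d2 / (2.0 * width ** 2))
    return activity / activity.sum()

def decode(centres, code):
    """Read the represented variable back out as an activity-weighted mean."""
    return code @ centres

def make_samples(n=500, seed=1):
    """Hypothetical 2-D sensorimotor variable, uniform on the unit square."""
    return np.random.default_rng(seed).uniform(0.0, 1.0, size=(n, 2))
```

A time-varying population code for a desired trajectory would then be obtained by calling `population_code` on successive points of that trajectory; the link matrix approximates the neighbourhood structure of the represented space.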