perceived obstacles [3]. Similarly, pigeons bob their heads up and down to recover depth
information [4]. Not only living beings, but robots too are embodied [5], and they have
the ability to act and to perceive. In the experiments presented here, the robot needs to act in order to perceive the objects it holds in its hand. These action-driven sensations are shaped by the physical properties of its body, of the world, and of the interplay between the two.
A humanoid robot moves toy bricks up and down and rotates them back and forth,
while holding them in its hand. The induced multi-modal sensory impressions are
used to train a modified version of a recurrent neural network with parametric bias
(RNNPB), originally developed by Tani and Ito [6]. The robot is able to self-organize
the contextual information and, in turn, to use this learned sensorimotor knowledge for object classification. Owing to the strong generalization capabilities of the recurrent architecture, the robot is even able to correctly classify unknown objects. Furthermore, we show that the proposed model is very robust against noise.
2 Theory
Despite its intriguing properties, the recurrent neural network with parametric bias has
hardly been used by anybody other than the original authors. Mostly, the architecture is
utilized to model the mirror neuron system [7,8]. Here we apply the variant proposed by
Cuijpers et al. [8] using an Elman-type structure [9] at its core. Furthermore, we modify
the training algorithm to include adaptive learning rates for the training of both the synaptic weights and the PB values. This results in an architecture that is more stable and converges
faster.
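To make this concrete, the sketch below shows one way such an Elman-type core with PB units could look in NumPy. The class name ElmanRNNPB, all layer sizes and the weight initialisation are our own illustrative assumptions, not settings taken from [8] or the original paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class ElmanRNNPB:
    """Elman-type recurrent network with parametric bias (PB) input.

    Layer sizes and initialisation scale are illustrative only.
    """

    def __init__(self, n_in, n_hidden, n_out, n_pb, seed=0):
        rng = np.random.default_rng(seed)
        scale = 0.1
        self.W_in = rng.normal(0.0, scale, (n_hidden, n_in))       # input -> hidden
        self.W_ctx = rng.normal(0.0, scale, (n_hidden, n_hidden))  # context -> hidden
        self.W_pb = rng.normal(0.0, scale, (n_hidden, n_pb))       # PB -> hidden
        self.W_out = rng.normal(0.0, scale, (n_out, n_hidden))     # hidden -> output
        self.n_hidden = n_hidden

    def forward(self, sequence, pb):
        """Run one sequence; the PB vector is held constant at every step."""
        h = np.zeros(self.n_hidden)  # Elman context units start at zero
        outputs = []
        for x in sequence:
            h = sigmoid(self.W_in @ x + self.W_ctx @ h + self.W_pb @ pb)
            outputs.append(sigmoid(self.W_out @ h))
        return np.array(outputs)

# Illustrative usage: a 10-step sequence of 3-dimensional sensory inputs.
net = ElmanRNNPB(n_in=3, n_hidden=20, n_out=3, n_pb=2)
sequence = np.random.default_rng(1).normal(size=(10, 3))
predictions = net.forward(sequence, pb=np.array([0.5, 0.5]))
```

The PB vector acts like an extra input that is clamped for the whole sequence, so a handful of PB units can encode which of the stored sensorimotor patterns the network should reproduce.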
2.1 Storage
The recurrent neural network with parametric bias (an overview of the architecture un-
folded in time can be seen in Fig. 1) can be used for the storage, retrieval and recognition
of sequences. For this purpose, the parametric bias (PB) vector is learned simultaneously and in an unsupervised manner during normal training of the network. The prediction error
with respect to the desired output is determined and backpropagated through time using
the BPTT algorithm [9]. However, the error is not only used to correct all the synaptic
weights present in the Elman-type network. Additionally, the error with respect to the PB nodes, δ^PB, is accumulated over time and used to update the PB values after an entire forward-backward pass of a single time series, denoted as epoch e. In contrast to the synaptic weights, which are shared by all training patterns, a unique PB vector is assigned to each individual training sequence. The update equations for the i-th unit of the parametric bias pb for a time series of length T are given as:
\rho_i(e+1) = \rho_i(e) + \gamma_i \sum_{t=1}^{T} \delta^{PB}_{i,t} ,   (1)

pb_i(e) = \mathrm{sigmoid}(\rho_i(e)) ,   (2)
where γ_i is the update rate for the PB values, which, in contrast to the original version, is not constant during training and not identical for every PB unit. Instead, it is scaled
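The following is a minimal sketch of the PB update in Eqs. (1) and (2). It assumes the per-step errors δ^PB_{i,t} have already been backpropagated to the PB nodes during the BPTT pass; the function name update_pb and all variable names are our own choices, and the constant γ in the usage example merely stands in for the adaptive, per-unit rates described above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def update_pb(rho, delta_pb, gamma):
    """One PB update after a full forward-backward pass (one epoch).

    rho      -- internal PB states rho_i(e), shape (n_pb,)
    delta_pb -- per-step PB errors delta^PB_{i,t} from BPTT, shape (T, n_pb)
    gamma    -- per-unit update rates gamma_i, shape (n_pb,)
    """
    rho_next = rho + gamma * delta_pb.sum(axis=0)  # Eq. (1): sum errors over t = 1..T
    pb = sigmoid(rho_next)                         # Eq. (2): squash into (0, 1)
    return rho_next, pb

# Illustrative usage with T = 50 time steps and 4 PB units.
rng = np.random.default_rng(0)
rho = np.zeros(4)
delta_pb = rng.normal(size=(50, 4))   # stands in for errors from BPTT
gamma = np.full(4, 0.01)              # constant here; adaptive in the paper
rho, pb = update_pb(rho, delta_pb, gamma)
```

Because the sigmoid keeps pb_i in (0, 1) while the update acts on the internal state ρ_i, the PB values change smoothly even when large errors accumulate over a sequence.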