Action-Driven Perception for a Humanoid - Agents and Artificial Intelligence


Comparing the classification results of the fully trained RNNPB with those of the SVC reveals that the support vector classifier performs better. Nevertheless, it has to be kept in mind that the maximum margin classifier cannot be used to generate or retrieve time series. Interestingly, the error rate is lower if the recurrent network is trained with only two object categories (Sec. 4.3). A potential explanation, besides random fluctuations, could be that during training a common set of weights has to be found for all object categories. Given the challenging input data, this process presumably interferes with the self-organization of the PB space.

A drawback of the presented model is that it currently operates on a fixed motor sequence. It would be desirable for the robot to perform motor babbling [17], leading not only to a self-organization of the sensory space but also to a self-organization of the sensorimotor space. A simple solution to this problem would be to additionally train the network with the motor sequence most appropriate for an object, i.e. the one reflecting its affordance [18]. This would lead to an even better classification result because the motor sequences themselves would help to distinguish the objects from each other and, thus, the emerging PB values would be arranged further apart in PB space (conversely, this means that it currently makes no sense to additionally train the network with the motor sequences, since they are identical for all objects). However, this does not address the fact that the robot should identify the object affordances, i.e. the movements characterizing an object, by itself.

In related research, Ogata et al. also extract multi-modal dynamic features of objects while a humanoid robot interacts with them [19]. However, there are distinct differences. Although we use fewer objects in total, the problem posed in our experiments is considerably harder. Our toy bricks have approximately the same circumference and identical color. Furthermore, they come in two weight classes with identical weight within each class, so that they can only be discriminated via multi-modal sensory information. We provide classification results, compare them to other methods (MLP and SVC), and evaluate the noise tolerance of the architecture. In addition, only prototype time series are used for training (in contrast to using all single-trial time series), which reduces the training time. Further, it is demonstrated that, if the network has already acquired sensorimotor knowledge of certain objects, it is able to generalize and provide fairly accurate sensory predictions for unseen ones (Fig. 7 right).
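To illustrate the idea of training on prototype time series rather than on every single trial, the following minimal sketch builds one prototype per object by averaging the trials element-wise. This is only an assumption made for illustration: the text above does not state how the prototypes are computed, and the function name `prototype` and the sample values are hypothetical.

```python
def prototype(trials):
    """Average equal-length single-trial time series into one prototype.

    Assumption for illustration: the prototype is the element-wise mean
    over trials. The text does not specify how prototypes are obtained.
    """
    if not trials:
        raise ValueError("at least one trial is required")
    length = len(trials[0])
    if any(len(trial) != length for trial in trials):
        raise ValueError("all trials must have the same length")
    # zip(*trials) groups the values recorded at the same time step.
    return [sum(step) / len(trials) for step in zip(*trials)]


# Two hypothetical single-trial sensor recordings of the same object.
trial_a = [1.0, 2.0, 3.0]
trial_b = [3.0, 4.0, 5.0]
mean_series = prototype([trial_a, trial_b])
```

Under this assumption the network sees a single series per object instead of one per trial, which is consistent with the shorter training time mentioned above.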

There are several potential applications of the presented model. As shown in Fig. 10, the network tolerates noise very well. This fact can be exploited for sensor de-noising. Despite receiving a noisy sensory signal, the robot will still be able to determine the PB values of the class representative based on the Euclidean distance. In turn, these values can be used to operate the RNNPB in retrieval mode (Sec. 2.2), generating the previously stored noise-free sensory signal, which can then be processed further. In fact, Kording and Wolpert suggested that the central nervous system combines visual, proprioceptive and other sensory information in nearly optimal fashion to overcome sensory and motor noise [20]. Alongside their Bayesian framework, an RNNPB might also be a possible way to model this 'de-noising' happening in the brain.
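The lookup step of the de-noising idea can be sketched as a nearest-neighbor search in PB space. The sketch below covers only the Euclidean distance comparison. The PB vectors, the labels, and the function name `nearest_representative` are hypothetical, and the subsequent retrieval of the stored signal by the RNNPB is not modeled.

```python
import math

# Hypothetical PB vectors of the class representatives (illustrative values,
# not taken from the experiments).
CLASS_PB = {
    "object_A": (0.2, 0.8),
    "object_B": (0.7, 0.3),
    "object_C": (0.9, 0.9),
}


def nearest_representative(pb, class_pb):
    """Return (label, pb_vector) of the representative closest to pb."""
    # math.dist computes the Euclidean distance (Python 3.8+).
    label = min(class_pb, key=lambda name: math.dist(pb, class_pb[name]))
    return label, class_pb[label]


# PB values assumed to have been recognized from a noisy sensory signal.
noisy_pb = (0.25, 0.7)
label, clean_pb = nearest_representative(noisy_pb, CLASS_PB)
```

In the scheme described above, the returned representative PB values (`clean_pb` here) would then be passed to the network in retrieval mode to regenerate the stored sensory signal.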

In conclusion, we present a promising framework for object classification based on action-driven perception implemented on a humanoid robot. The underlying design principles are rooted in neuroscientific and philosophical hypotheses.
