Comparing the classification results of the fully trained RNNPB with the SVC re-
veals a superior performance of the support vector classifier. Nevertheless, it has to be
kept in mind that the maximum margin classifier cannot be used to generate or retrieve
time series. Interestingly, the error rate is lower if the recurrent network is only trained
with two object categories (Sec. 4.3). A potential explanation, besides random fluctu-
ations, could be that during training a common set of weights has to be found for all
object categories. This process presumably interferes, due to the challenging input data,
with the self-organization of the PB space.
A drawback of the presented model is that it currently operates on a fixed motor
sequence. It would be desirable if the robot performed motor babbling [17] leading
not only to a self-organization of the sensory space, but to a self-organization of the
sensorimotor space. A simple solution to this problem would be to train the network
additionally with the motor sequence most appropriate for an object, i.e. reflecting its
affordance [18]. This would lead to an even better classification result because the motor
sequences themselves would help to distinguish the objects from each other and, thus,
the emerging PB values would be arranged further apart in PB space (conversely, in the
current setup it makes no sense to additionally train the network with the motor
sequences, since they are identical for all objects). However, that does not address the fact that the robot should
identify the object affordances, the movements characterizing an object, by itself.
In related research, Ogata et al. also extract multi-modal dynamic features of ob-
jects, while a humanoid robot interacts with them [19]. However, there are distinct dif-
ferences. Despite using fewer objects in total, the problem posed in our experiments
is considerably harder. Our toy bricks have approximately the same circumference and
identical color. Furthermore, they exist in two weight classes with an identical in-class
weight that can only be discriminated via multi-modal sensory information. We provide
classification results, compare the results to other methods (MLP and SVC) and eval-
uate the noise tolerance of the architecture. In addition, only prototype time series are
used for training (in contrast to using all single-trial time series), resulting in a reduced
training time. Further, it is demonstrated that, if the network has already acquired senso-
rimotor knowledge of certain objects, it is able to generalize and provide fairly accurate
sensory predictions for unseen ones (Fig. 7 right).
There are several potential applications of the presented model. As shown in Fig. 10,
the network tolerates noise very well. This fact can be exploited for sensor de-noising.
Despite receiving a noisy sensory signal, the robot will still be able to determine the PB
values of the class representative based on the Euclidean distance. In turn, these values
can be used to operate the RNNPB in retrieval mode (Sec. 2.2), generating the noise-free
sensory signal previously stored, which then can be processed further. In fact, Körding
and Wolpert suggested that the central nervous system combines, in nearly optimal
fashion, visual, proprioceptive and other sensory information to overcome sensory and
motor noise [20]. Besides their Bayesian framework, an RNNPB might also be a possible
way to model this 'de-noising' happening in the brain.
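
The de-noising step described above, picking the class representative whose PB values lie closest to the recognized ones, can be illustrated with a minimal sketch. It covers only the nearest-neighbor lookup in PB space and is not the authors' implementation. The function name, the class labels, the two-dimensional PB space and all numeric values are assumptions made for illustration. Recognizing PB values from the noisy signal and running the RNNPB in retrieval mode are not shown.

```python
import math

def nearest_class(pb, class_pb):
    """Return the label whose stored PB vector has the smallest
    Euclidean distance to the recognized PB vector `pb`."""
    return min(class_pb, key=lambda label: math.dist(pb, class_pb[label]))

# Hypothetical PB values of two class representatives (illustrative only).
class_pb = {
    "light_brick": (0.2, 0.8),
    "heavy_brick": (0.7, 0.3),
}

# Hypothetical PB values recognized from a noisy sensory sequence.
noisy_pb = (0.25, 0.7)

label = nearest_class(noisy_pb, class_pb)
# PB values of the class representative; these would be clamped in
# retrieval mode to regenerate the stored, noise-free sensory signal.
clean_pb = class_pb[label]
```

Because only the distance to the stored representatives matters, moderate perturbations of the recognized PB values leave the selected class unchanged, which is the property the de-noising application relies on.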
In conclusion, we present a promising framework for object classification based on
action-driven perception implemented on a humanoid robot. The underlying design
principles are rooted in neuroscientific and philosophical hypotheses.
 