to handle disagreeing modalities in such a way that results competitive
with the best channel could be achieved.
4.3 Offline versus online fusion
Characteristic of current research on the multimodal analysis of
social and emotional signals is the strong concentration on a posteriori
analyses. Of the many methods discussed in the recent analysis
by D'Mello and Kory (2012), hardly any was tested in
an online scenario where a system responds to users' social and
emotional signals while they are interacting with it. The move from
offline to online analysis of social and affective cues raises a number
of challenges for the multimodal recognition task. While in offline
analysis the whole signal is available and analysis can fall back on
global statistics, such a treatment is no longer possible for online
analysis. In addition, offline analysis usually focuses on a small set of
pre-defined emotion classes and neglects, for example, data that could
not be uniquely assigned to a particular emotion class. Online analysis,
however, has to take all emotion data into account. Finally, while there
are usually no temporal restrictions for offline analysis, online analysis
has to be very fast to enable a fluent human-robot dialogue. A fusion
mechanism specifically adapted to the needs of online fusion has
been used in the Callas Emotional Tree, an artistic Augmented Reality
installation of a tree which responds to the spectators' spontaneous
emotional reactions to it; see Gilroy et al. (2008). The basic idea of this
approach is to derive emotional information from different modality-specific
sensors and map it onto the three dimensions of the Pleasure-Arousal-Dominance
model (PAD model) by Mehrabian (1980). In addition
to the input provided by a modality-specific sensor at a particular
instant of time, the approach considers the temporal dynamics of
modality-specific emotions by integrating the current value provided
by a sensor with the previous value. The fusion vector then results
from a combination of the vectors representing the single modality-
specific contributions. Unlike traditional approaches to sensor fusion,
PAD-based fusion integrates contributions from the single modalities
in a frame-wise fashion and is thus able to respond immediately to a
user's emotional state.
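The frame-wise scheme described above can be illustrated with a minimal
Python sketch. It is not the implementation used by Gilroy et al. (2008);
the text only states that the current sensor value is integrated with the
previous one and that the modality vectors are combined. The class name
PadFusion, the exponential smoothing factor alpha, the per-modality
weights, and the weighted mean used as the combination rule are all
assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

PAD = Tuple[float, float, float]  # (pleasure, arousal, dominance)


@dataclass
class PadFusion:
    """Frame-wise fusion of modality-specific PAD vectors (illustrative only)."""

    alpha: float = 0.5  # assumed weight of the current frame vs. the previous value
    weights: Dict[str, float] = field(default_factory=dict)  # assumed modality weights
    _state: Dict[str, PAD] = field(default_factory=dict)  # last integrated value per modality

    def update(self, modality: str, current: PAD) -> PAD:
        """Integrate the current sensor value with the previous integrated value."""
        # On the first frame there is no history, so the current value is used as is.
        previous = self._state.get(modality, current)
        integrated = tuple(
            self.alpha * c + (1.0 - self.alpha) * p
            for c, p in zip(current, previous)
        )
        self._state[modality] = integrated
        return integrated

    def fuse(self) -> PAD:
        """Combine all modality vectors into one fusion vector (weighted mean)."""
        if not self._state:
            return (0.0, 0.0, 0.0)  # neutral point when no modality has reported yet
        total = sum(self.weights.get(m, 1.0) for m in self._state)
        return tuple(
            sum(self.weights.get(m, 1.0) * v[i] for m, v in self._state.items()) / total
            for i in range(3)
        )


# Hypothetical usage: two modalities report once, then speech reports again.
fusion = PadFusion(alpha=0.5, weights={"speech": 2.0, "face": 1.0})
fusion.update("speech", (1.0, 0.0, 0.0))
fusion.update("face", (-1.0, 0.5, 0.0))
first_frame = fusion.fuse()
fusion.update("speech", (0.0, 1.0, 0.0))
second_frame = fusion.fuse()
```

Because fuse() can be called after every single update, a fused estimate
is available at each frame without waiting for the whole signal, which is
the property that distinguishes this style of fusion from offline
approaches that rely on global statistics.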
4.4 Choice of segments to be considered in the fusion process
Even though it is obvious that each modality has a different timing,
most fusion mechanisms either use processing units of a fixed duration