Figure 2. Affective Listener Alfred: the current user state is perceived using SSI, a framework for social signal interpretation (Wagner et al., 2011b) (upper left window); observed cues are mapped onto the valence and arousal dimensions of a 2D emotion model (upper middle window); values for arousal and valence are combined into a final decision and transformed into a set of Facial Action Coding System (FACS) parameters, which are visualized by the virtual character Alfred (right window).
(A color image of this figure appears in the color plate section at the end of the book.)
The fusion approach is inspired by the one developed for the Augmented Reality Tree. However, while Gilroy et al. (2011) generate one vector per modality, Wagner et al. (2011b) generate one vector per detected event. This prevents sudden leaps in the case of a false detection: since the strength of a vector decreases over time, the influence of older events gradually lessens until it falls below a certain threshold, at which point the event is removed entirely.
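To make the decay mechanism concrete, the following Python sketch illustrates one way such event-based fusion could work. It is a minimal illustration under assumptions, not the actual SSI implementation: the names (EventVector, fuse), the linear decay, the decay rate, and the threshold value are all invented for this example.

    from dataclasses import dataclass

    @dataclass
    class EventVector:
        valence: float    # contribution on the valence axis, in [-1, 1]
        arousal: float    # contribution on the arousal axis, in [-1, 1]
        strength: float   # initial weight of the detected event
        timestamp: float  # detection time, in seconds

    DECAY_PER_SECOND = 0.1  # assumed linear decay rate
    THRESHOLD = 0.05        # assumed cutoff below which events are removed

    def current_strength(ev: EventVector, now: float) -> float:
        # The influence of an event shrinks linearly with its age.
        return ev.strength - DECAY_PER_SECOND * (now - ev.timestamp)

    def fuse(events: list[EventVector], now: float) -> tuple[float, float]:
        # Remove events whose strength has fallen below the threshold.
        active = [ev for ev in events if current_strength(ev, now) >= THRESHOLD]
        events[:] = active
        total = sum(current_strength(ev, now) for ev in active)
        if total == 0:
            return 0.0, 0.0  # neutral state when no active events remain
        # Weighted average: a single false detection is outvoted by the
        # other events and fades gradually instead of causing a sudden leap.
        valence = sum(current_strength(ev, now) * ev.valence for ev in active) / total
        arousal = sum(current_strength(ev, now) * ev.arousal for ev in active) / total
        return valence, arousal

Calling fuse repeatedly with the current time yields a smoothly evolving valence/arousal estimate, which matches the behavior described above.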
5. Exploiting Social Signals for Semantic Interpretation
Few systems combine semantic multimodal fusion for task-based command interpretation with multimodal fusion of social signals. A few studies nevertheless mention interactions between the two communication streams, which arise in users' behaviors. For example, a user may say "Thanks" to a virtual agent and at the same time start a new command using gesture (Martin et al., 2006). In another study on the multimodal behaviors of users interacting with a virtual character embedded in a 3D graphical environment, such concurrent behaviors were also observed. In those cases, speech input was preferred for social communication with the virtual character ("how old are you?"), whereas 2D gesture input was used in parallel for task-related commands.
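One hypothetical way to accommodate such concurrent behaviors is to route each modality to its own interpretation stream rather than forcing a single joint interpretation. The Python sketch below illustrates this idea; it is not drawn from any of the cited systems, and all names (InputEvent, handle_event, the two interpreters) are invented for illustration.

    from dataclasses import dataclass

    @dataclass
    class InputEvent:
        modality: str  # "speech" or "gesture"
        content: str   # recognized utterance or gesture label

    def interpret_social(utterance: str) -> str:
        # Social-communication stream, e.g. small talk with the character.
        return f"[social] respond to: {utterance!r}"

    def interpret_command(gesture: str) -> str:
        # Task stream: map 2D gesture input onto a command in the 3D scene.
        return f"[task] execute command: {gesture!r}"

    def handle_event(event: InputEvent) -> str:
        # Concurrent events are served independently: speech feeds the
        # social stream, gesture feeds the command stream.
        if event.modality == "speech":
            return interpret_social(event.content)
        return interpret_command(event.content)

    # A user asks "how old are you?" while simultaneously drawing a 2D
    # gesture that selects an object; neither stream blocks the other.
    for ev in (InputEvent("speech", "how old are you?"),
               InputEvent("gesture", "circle-select")):
        print(handle_event(ev))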