types of meaning it carries, and its function during communication
(McNeill, 1992; Poggi, 2007; Ekman and Friesen, 2003). An NVB is
characterized by the shape of the signals that compose it and their
associated meaning. This shape-meaning link depends strongly on
the discursive context. Poggi called a (signal, meaning) pair a
communicative act. A communicative act may have several such pairs
attached to it. A lexicon is like a dictionary of
communicative acts that makes the mapping between signals
and meanings explicit. Most agent systems have a lexicon built into them
(Cassell et al., 1994; Cassell et al., 2001; Pelachaud, 2005). Lee and
Marsella (2006) proposed the Nonverbal Behavior Generator (NVBG), which
adds NVBs based on a semantic analysis of the text to be spoken by the
agent. More recently, Bergmann and Kopp developed a computational model
of multimodal behavior generation that outputs sentences with their
associated hand gestures. The model is based on a statistical analysis
of an annotated corpus of human dialogs on a specific topic (here,
giving spatial directions). The virtual agent can create complex
iconic gestures on the fly relating to the route directions it describes.
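The lexicon idea described above can be illustrated with a minimal sketch: a dictionary that makes the signal-to-meaning mapping explicit, with lookups in both directions. All entry names and helper functions here are illustrative assumptions, not taken from any of the cited systems.

```python
# Illustrative behavior lexicon: each communicative act pairs a signal
# (a behavior label) with the meanings it can convey. Entries are
# invented examples, not drawn from any published lexicon.
lexicon = {
    "head_nod": ["agreement", "backchannel"],
    "eyebrow_raise": ["emphasis", "surprise"],
    "open_palm_gesture": ["offering", "openness"],
}

def meanings_of(signal):
    """Look up the meanings a signal can carry.

    Real systems condition this on discursive context; that is not
    modeled in this sketch.
    """
    return lexicon.get(signal, [])

def signals_for(meaning):
    """Inverse lookup: which signals can convey a given meaning."""
    return [s for s, ms in lexicon.items() if meaning in ms]

print(meanings_of("head_nod"))   # → ['agreement', 'backchannel']
print(signals_for("emphasis"))   # → ['eyebrow_raise']
```

A generator in the style of NVBG would sit on top of such a table, selecting signals for the meanings identified by a semantic analysis of the agent's text; the one-to-many mappings in both directions are why context is needed to disambiguate.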
4. Summary and Future Trends
This article focused on the annotation and representation of multimodal
behavior for the purpose of designing and developing virtual characters
and embodied agents. It has surveyed issues relevant to the annotation
task itself, and given detailed examples of mark-up languages for
emotion and behavior representation. We have distinguished an
annotation scheme from its representation, and emphasized the
dependence of the annotation scheme on the theory describing
how the communicative phenomena are categorized in the study. We
have also discussed general requirements for a multimodal annotation
framework, including extensibility, incrementality, and uniformity,
and presented different mark-up languages that go beyond
application-specific representations by offering theoretically consistent
approaches ready for use when comparing and evaluating annotations.
On-going work on multimodal annotation focuses on developing
and extending annotation schemes further, by enhancing the existing
ones with more accurate and detailed feature specifications, and by
broadening the set of annotation categories to cover new phenomena.
For instance, active research is under way on topics such as emotion
and affective mark-up languages (Schröder et al., 2011), laughter
(Truong and Trouvain, 2012), analysis of audiovisual and paralinguistic
phenomena (an overview is given in Schuller et al., 2013), as well as
eye-gaze, turn-taking, and attention (Levitski et al., 2012; Bednarik