3.1 Model-based gesture generation
The first systems to address the challenge of iconic gesture
generation were lexicon-based. In general, these systems
were characterized by a straightforward mapping of meaning onto
gesture form. The Behavior Expression Animation Toolkit (BEAT) was
among the first of a new generation of toolkits to generate
synthetic speech together with synchronized nonverbal behaviors, such
as hand gestures and facial displays, realized by an animated
human figure (Cassell et al., 2001). This mapping of text
onto multimodal behavior was based on representations of linguistic
and social context and on behavior generation rules derived from
empirical results. A similar approach was taken with the Nonverbal
Behavior Generator (NVBG), proposed by Lee and Marsella (2006). The
system analyzes the syntactic and semantic structure of surface texts
and takes the affective state of the embodied agent into account to
generate appropriate nonverbal behaviors. Based on a study from the
literature and a video analysis of emotional dialogues, the authors
developed a list of nonverbal behavior generation rules. The Real
Estate Agent (REA) is a more elaborate system, as it aims to model
the bidirectional process of communication (Cassell, 2000). That is,
in addition to generating nonverbal behaviors, the system also
seeks to understand how a human interlocutor uses these same
modalities. The focus of gesture generation in the REA system is the
context-dependent coordination of (lexicalized) gestures with speech,
accounting for the fact that gestures do not always carry the same
meaning as speech.
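To make the lexicon-based scheme concrete, the following minimal Python sketch shows a keyword-triggered mapping from semantic tags to gesture entries. All names, rules, and lexicon entries are invented for illustration; they are not taken from BEAT, NVBG, or REA.

    # Hypothetical sketch of a lexicon-based gesture mapper, loosely in the
    # spirit of rule-based systems such as BEAT or NVBG (all names invented).

    GESTURE_LEXICON = {
        "affirmation": {"gesture": "head_nod", "hands": None},
        "negation":    {"gesture": "head_shake", "hands": None},
        "enumeration": {"gesture": "beat", "hands": "right"},
        "large_size":  {"gesture": "two_hands_apart", "hands": "both"},
    }

    # Each rule pairs a trigger word with a semantic tag in the lexicon.
    RULES = [
        ("yes", "affirmation"),
        ("not", "negation"),
        ("first", "enumeration"),
        ("huge", "large_size"),
    ]

    def generate_behaviors(utterance):
        """Return gesture specs aligned with the words that triggered them."""
        behaviors = []
        for index, word in enumerate(utterance.lower().split()):
            for trigger, tag in RULES:
                if word.startswith(trigger):
                    behaviors.append(dict(GESTURE_LEXICON[tag], word_index=index))
        return behaviors

    print(generate_behaviors("Yes, the first room is huge"))
    # gestures for 'Yes' (head nod), 'first' (beat), and 'huge' (size gesture)

The fixed lexicon is exactly the limitation discussed next: a meaning without an entry simply produces no gesture.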
Relying on empirical results, the systems mentioned so far focus
on the context-dependent coordination of gestures with concurrent
speech, whereby gestures are drawn from a lexicon. The flexibility and
generative power of such gestures to express new content are therefore
very limited. A different approach, closely related to
the generation of speech-accompanying gestures in a spatial domain,
is Huenerfauth's (2008) system, which translates English texts into
American Sign Language (ASL) with a focus on classifier predicates,
complex and descriptive types of ASL sentences. These classifier
predicates share several similarities with iconic gestures accompanying
speech. The system likewise relies on a library of prototypical templates,
one for each type of classifier predicate, whose missing parameters are
filled in according to the particular context.
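The template-filling step can be sketched in the same hypothetical style: a prototype per predicate type fixes some parameters (here the handshape) and leaves others open, to be supplied by the spatial context. None of the names or parameter choices below are taken from Huenerfauth's system; they are assumptions for illustration.

    from dataclasses import dataclass, replace

    @dataclass
    class ClassifierTemplate:
        """Prototype for one classifier predicate type; None marks open slots."""
        predicate_type: str
        handshape: str           # fixed by the predicate type
        location: tuple = None   # filled from the spatial context
        movement: str = None     # filled from the described event

    # Hypothetical prototype: a vehicle moving along a path.
    VEHICLE_PATH = ClassifierTemplate("vehicle_path", handshape="3-hand")

    def instantiate(template, scene):
        """Fill the template's open parameters from the current scene model."""
        return replace(template,
                       location=scene["start_position"],
                       movement=scene["path_shape"])

    scene = {"start_position": (0.2, 0.0, 0.5), "path_shape": "arc_left"}
    print(instantiate(VEHICLE_PATH, scene))
    # ClassifierTemplate(predicate_type='vehicle_path', handshape='3-hand',
    #                    location=(0.2, 0.0, 0.5), movement='arc_left')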
The NUMACK system (Kopp et al., 2007) tries to overcome
the limitations of lexicon-based gesture generation by considering
patterns of human gesture composition. Based on empirical results,