as out-of-sync in either the preparation or the stroke movement phase. The evaluation showed that most of the behavior generated by the proposed PLATTOS system can be assessed as viable and close to human-like.
7. Conclusion
This chapter presented a novel TTS-driven non-verbal behavior system for coverbal gesture synthesis. The system's architecture, and the grammar used to synchronize non-verbal expressions with verbal information in the symbolic and temporal domains, were presented in detail. We further showed how meaningful parts of the verbal content are identified and selected based on word-type order and on the semiotic patterns within the proposed system. We also described how a visual representation of meaning can be selected, how the structure of its propagation can be generated as a sequence of movement phases (based on lexical affiliation and semiotic rules), and how movement phases and movement durations can be aligned with the verbal content. Finally, we explained how a procedural script is formed that drives the synchronized verbal and coverbal behavior. The generated synthetic behavior already exhibits a high degree of lip-sync and supports iconic, symbolic, and indexical expressions, as well as adaptors. As the evaluation showed, most of the generated behavior appears 'natural' and may adequately represent the verbal content. As part of our future work, we will investigate the possibility of dynamically adjusting the degree of visual pronunciation with regard to accented phonemes (visemes). Since EVA-SCRIPT and ECA-EVA already support this feature, most of this investigation will be oriented towards expressive TTS models. To improve the rules stored within the semiotic grammar, we will further annotate our video corpora in order to fine-tune the existing rules, especially regarding movement dynamics and additional shapes representing the meanings of words and word phrases. Some effort will also be directed towards enriching the presented gestural dictionary. By annotating additional segments of the video corpora, we will create additional gesture instances, which will further contribute to the diversity (typical of natural behavior) and the expressive capabilities of ECAs. Additionally, we will try to incorporate several other part-of-speech attributes for each word type. This should not only enrich the non-verbal behavior but also improve the processes used for selecting the meaningful word, the semiotic type of movement, and the position of the meaningful shape (the stroke phase).
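The alignment of movement phases with verbal content summarized above can be illustrated with a small sketch. All names and timing values here (`Phase`, `align_gesture`, the preparation and retraction durations) are hypothetical illustrations of the general idea of fitting preparation, stroke, and retraction phases around the timing of a meaningful word reported by a TTS engine; they are not the actual PLATTOS implementation described in this chapter.

```python
# Hypothetical sketch: place a gesture's stroke phase over the
# meaningful (accented) word, with preparation leading into it and
# retraction following it. Timings and names are illustrative only.
from dataclasses import dataclass


@dataclass
class Phase:
    name: str
    start: float  # seconds
    end: float    # seconds


def align_gesture(word_start: float, word_end: float,
                  prep_time: float = 0.3,
                  retract_time: float = 0.4) -> list[Phase]:
    """Return preparation/stroke/retraction phases so that the stroke
    coincides with the word interval [word_start, word_end]."""
    return [
        Phase("preparation", max(0.0, word_start - prep_time), word_start),
        Phase("stroke", word_start, word_end),
        Phase("retraction", word_end, word_end + retract_time),
    ]


# Example: the TTS engine reports that the meaningful word
# spans 1.2 s to 1.6 s of the synthesized utterance.
phases = align_gesture(1.2, 1.6)
for p in phases:
    print(f"{p.name}: {p.start:.2f}-{p.end:.2f} s")
```

In this toy version the stroke duration is dictated entirely by the word's duration, which mirrors the chapter's point that movement phases and durations are derived from, and aligned with, the verbal content rather than fixed in advance.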