language technology that provides incremental interpretation of
partial utterances, such as the work of DeVault et al. (2011), which
provides a semantic interpretation, a measure of confidence in the
current understanding and a measure of whether continued listening
will lead to better understanding. The virtual human then produces a
valenced reaction to its evolving interpretation of the speaker's
utterance. For example, if the virtual human interprets the speaker's
partial utterance as deliberately proposing an action to harm the
virtual human, then the reaction will be anger.
The model analyzes this information and triggers relevant listener
feedback rules, which are mapped to appropriate nonverbal behaviors,
such as nods for generic feedback and expressions of confusion,
comprehension, happiness or anger for the specific feedback. These
behaviors are also conditional on the listener's roles and goals. In
particular, a listener that is the main addressee and has the goals
of participating in and understanding the conversation will engage
in mutual gaze with the speaker, nod to signal attention, and signal
both comprehension of and reaction to the content of the utterance. On
the other hand, an eavesdropper that has the goal of avoiding
participation in the conversation will avoid mutual gaze and will not
signal attention with nods.
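
The mapping from understanding to behavior described above can be sketched as a small rule table. This is only an illustrative sketch, not the model's actual rule set: the class names, role and valence labels, behavior names and the two thresholds are all assumptions introduced for this example.

```python
from dataclasses import dataclass


@dataclass
class PartialUnderstanding:
    """Hypothetical snapshot of the incremental interpretation."""
    confidence: float           # confidence in the current understanding, 0..1
    keep_listening_helps: bool  # is more input expected to improve understanding?
    valence: str                # assumed labels: "positive", "negative", "neutral"


@dataclass
class Listener:
    """Hypothetical listener state: conversational role plus goals."""
    role: str                   # assumed labels: "addressee", "eavesdropper"
    wants_to_participate: bool
    wants_to_understand: bool


def select_behaviors(listener, understanding,
                     confusion_threshold=0.3, comprehension_threshold=0.7):
    """Return nonverbal behaviors for one update (thresholds are assumptions)."""
    # A listener avoiding participation withholds gaze and nods entirely.
    if not listener.wants_to_participate:
        return ["avert_gaze"]

    behaviors = []
    # Generic feedback: the main addressee signals attention.
    if listener.role == "addressee":
        behaviors += ["mutual_gaze", "nod"]

    # Specific feedback: depends on the evolving interpretation.
    if listener.wants_to_understand:
        stuck = (understanding.confidence < confusion_threshold
                 and not understanding.keep_listening_helps)
        if stuck:
            behaviors.append("confusion")
        elif understanding.confidence >= comprehension_threshold:
            behaviors.append("comprehension")
            # Valenced reaction to the content that was understood.
            if understanding.valence == "negative":
                behaviors.append("anger")
            elif understanding.valence == "positive":
                behaviors.append("happiness")
    return behaviors


addressee = Listener("addressee", True, True)
eavesdropper = Listener("eavesdropper", False, False)
threat = PartialUnderstanding(confidence=0.9, keep_listening_helps=False,
                              valence="negative")

print(select_behaviors(addressee, threat))
print(select_behaviors(eavesdropper, threat))
```

In this sketch the role and goals gate the generic feedback, while the specific feedback is emitted only once confidence crosses a threshold, which mirrors the distinction the text draws between nods and content-dependent expressions.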
4. Interpersonal Dynamic: Speaker and
Listener Interaction
A prime example of interpersonal dynamics is backchannel feedback,
the nods and para-verbals such as “uh-huh” and “mm-hmm” that
listeners produce while someone is speaking (Watzlawick et al., 1967).
Backchannels can express a degree of connection between listener
and speaker (e.g., rapport), show acknowledgement (e.g.,
grounding) or signify agreement.
Backchannel feedback is an essential and predictable aspect of natural
conversation, and its absence can significantly disrupt participants'
ability to communicate (Bavelas et al., 2000). Accurately recognizing
the backchannel feedback from one individual is challenging since
these conversational cues are subtle and vary between people. Learning
how to predict backchannel feedback is a key research problem for
building immersive virtual humans and robots. Finally, there are still
unanswered questions in linguistics, psychology and sociology about
what triggers backchannel feedback and how it differs across
cultures. In this chapter we show the importance of modeling both
the multimodal and interpersonal dynamics of backchannel feedback
for recognition, prediction and analysis.