because it does not seem to have an obvious communicative function,
but it may also be a signal of the agent being bored or having nothing
to say, in which case it carries communicative meaning. Annotations
thus differ depending on whether the gestures are interpreted as
being intentionally communicative by the communicator (displayed or
signaled) (Allwood, 2001), or the gestures are judged (by the annotator)
to have a noticeable effect on the recipient.
Since emerging technology allows the recognition of gestures and
faces via cameras and sensors, it is possible to extract gestures and
facial expressions from the data. The form-features of gestures can
then be automatically linked to appropriate communicative functions.
By combining the top-down approach, i.e. manual annotation and
analysis of the data, with the bottom-up analysis of the multimodal
signals, we can visualize the speakers' communicative behavior, and
also show how synchrony of conversation is constructed through
the interlocutors' activity (Jokinen and Scherer, 2012). The top-down
approach relies on human observation: for example, video recordings are
manually tagged according to an annotation scheme in order to mark
communicatively important events. The bottom-up approach, on the other hand, uses
automatic technological means to recognize, cluster, and interpret the
signals that the communicating agents emit. These two approaches
look at the communicative situations from two opposite viewpoints:
they use different methods and techniques, but the object of study is
the same. Communication models can thus be built and incorporated
into smart applications through top-down human observations and
bottom-up automatic analysis of the interaction, and the approach is
beneficial for both interaction technology and human communication
studies. New algorithms and models will be required for the detection
and processing of speech information along with gestural and facial
expressions, and existing technologies will need to be adapted to
accommodate these advances. Simulations based on manual analysis
of corpora of gestures and facial expressions are already incorporated
within the development of Embodied Conversational Agents (e.g.,
André and Pelachaud, 2010), and a motion capture tool to gather
more precise data about the user's behavior is described in Csapo et
al. (2012).
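To make the combination of the two approaches more concrete, the following sketch clusters automatically extracted gesture form-features (the bottom-up step) and cross-tabulates the resulting clusters with manually annotated communicative functions (the top-down step). The feature values, the function labels, and the choice of k-means clustering are illustrative assumptions only; they are not taken from the studies cited above.

# Illustrative sketch (not from the cited work): cluster automatically
# extracted gesture form-features (bottom-up) and compare the clusters
# with manually annotated communicative functions (top-down).
from collections import Counter

import numpy as np
from sklearn.cluster import KMeans

# Hypothetical form-features per gesture: [hand velocity, amplitude, duration]
features = np.array([
    [0.9, 0.8, 0.4],   # fast, large, short  -> e.g. an emphatic beat
    [0.8, 0.7, 0.5],
    [0.1, 0.2, 1.5],   # slow, small, long   -> e.g. a hold / hesitation
    [0.2, 0.1, 1.8],
    [0.5, 0.9, 1.0],   # medium, large, long -> e.g. an iconic gesture
    [0.6, 0.8, 1.1],
])

# Hypothetical manual (top-down) annotations of communicative function
manual_labels = ["emphasis", "emphasis", "hesitation",
                 "hesitation", "iconic", "iconic"]

# Bottom-up step: unsupervised clustering of the form-features
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
clusters = kmeans.fit_predict(features)

# Cross-tabulate the clusters against the manual annotation to see how
# well the automatically found groups line up with the functions.
table = Counter(zip(clusters.tolist(), manual_labels))
for (cluster_id, label), count in sorted(table.items()):
    print(f"cluster {cluster_id} <-> {label}: {count}")

In practice the feature vectors would come from camera, sensor, or motion-capture streams rather than being entered by hand, and the cross-tabulation would serve as a check on how closely the automatically discovered groups correspond to the manually annotated communicative functions.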
2.3 Interdependence: Modal and multimodal annotation
Two kinds of annotation of interaction data can be considered. The
first is uni-modal annotation, which is specific to a particular modality,
e.g. dialogue act annotation or gesture annotation, and the second
is multimodal annotation proper, which takes the relation between