give them a unique identifier as it is managed by the system: “put this object?”
+ gesture in (x1, y1); “put this file?” + gesture in (x1, y1); “put 'submis.tex'?”
(no gesture); “put obj 4353 ?” (no gesture); etc. The assessment procedure thus
includes natural language paraphrase of a multimodal reference. What remains
simple for the deictic gesture is much less simple for other types of co-verbal
gestures. Let us imagine, for example, that “put that there” (or rather “move
that there”, to keep the example simple) is accompanied by a
single gesture that starts at the object to be moved and ends at the destination.
According to a first hypothesis that takes up the presentation of section 6.2.2,
this gesture trajectory is considered to be the materialization of the necessary
transition between pointing at an object and pointing at a location. In this
case, only the ends of the curve are used during the semantic analyses: the
point (x1, y1), and then the object located at this point or in its immediate vicinity,
is unified with “that”, and the point (x2, y2) is unified with “there”. In other
words, we return to the previous case. According to a second hypothesis, the
trajectory is considered to be a combination of these two pointing gestures with
an illustrating co-verbal gesture that provides a characteristic of the movement
action, namely the path (or points of passage) to follow. The trajectory is
analyzed from a temporal point of view (curve generated in a regular manner,
with no significant stop) and a structural point of view (arc), before being
unified with “move”, that is, interpreted as a movement path. If we wish to
test this multimodal system's functionality, we only need to ask an additional
question Q: “follow this trajectory?” or “move along these points of passage?”,
by taking up the full gestures in one case or the other. The only drawback,
which applies to all DQR methodologies, is that the system must be able to process
such questions.
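The two hypotheses about the single “move that there” gesture can be illustrated with a minimal sketch. It is only an illustration under stated assumptions, not the system discussed here: the names `SceneObject`, `nearest_object` and `interpret_move`, the `radius` used for the “immediate vicinity”, and the dictionary returned are all hypothetical. Under the first hypothesis only the two ends of the trajectory are used; under the second, the whole trajectory is also kept as a movement path. The temporal and structural checks mentioned above (regular curve, no significant stop, arc) are deliberately left out.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class SceneObject:
    """Hypothetical displayed object with a name and a screen position."""
    name: str
    position: Point


def nearest_object(point: Point, objects: List[SceneObject],
                   radius: float = 10.0) -> Optional[SceneObject]:
    """Return the closest object within `radius` of `point`, or None.

    `radius` is an assumed stand-in for the "immediate vicinity".
    """
    best, best_dist = None, radius
    for obj in objects:
        dist = math.dist(point, obj.position)
        if dist <= best_dist:
            best, best_dist = obj, dist
    return best


def interpret_move(trajectory: List[Point], objects: List[SceneObject],
                   use_path: bool = False) -> Dict[str, object]:
    """Unify a gesture trajectory with the words of "move that there".

    First hypothesis (use_path=False): only the ends of the curve are used.
    Second hypothesis (use_path=True): the trajectory is also attached to
    "move" as the path (points of passage) to follow.
    """
    start, end = trajectory[0], trajectory[-1]
    result: Dict[str, object] = {
        "that": nearest_object(start, objects),  # object at or near (x1, y1)
        "there": end,                            # destination (x2, y2)
    }
    if use_path:
        result["path"] = list(trajectory)
    return result
```

With a trajectory starting next to an object and ending elsewhere, both calls resolve “that” and “there” in the same way; only the second one keeps the intermediate points, which is what a question such as “follow this trajectory?” would then probe.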
- Level 3 = inference. This concerns the construction of the utterance's full
meaning. The difficulty lies in identifying allusions, which
calls upon common sense reasoning and pragmatic inferences. With D =
“I would like a return ticket for Paris”, the authors suggest Q = “would like
ticket?” This aspect is independent from the communication modalities and is
still valid for the multimodal dialogue.
- Level 4 = illocutionary act interpretation. Here we enter the levels of
dialogue, with a first aspect concerning speech acts and the system's ability to
identify the correct type of act, even in the case of an indirect act. With D =
“a ticket for Paris”, which can follow a question or match an initial request,
the question Q = “is this a request?” allows the system to assess the act it