the perceived quality of an ECA (see Chapter 9 of [GAR 11]), or of text-to-speech
synthesis, as well as the assessment of the models and model derivation
processes used in design-time architectures (see section 4.2). Each domain,
even the very vast domain of interpretation or that of dialogue management,
has its own criteria, which refer to the techniques used and which are well
beyond the scope of this topic. We present the characteristics of an approach
for assessing an MMD system that accounts for the specificities of oral and
multimodal interaction. It takes into account both objective aspects (number
of speaking turns, average duration of utterances, number of refusals) and
subjective aspects (interviewing users about their impressions, having them
fill out questionnaires), both of which are also subject to descriptive and
inferential statistical analyses (see section 2.2 and [JUR 09, p. 872]).
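As an illustration of the objective measures listed above, the following minimal sketch computes them from a dialogue log. The log format, the Turn structure and the refusal flag are assumptions made for this example; they are not taken from any particular system or from the cited references.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Turn:
    """One speaking turn in a hypothetical dialogue log."""
    speaker: str           # "user" or "system"
    start: float           # start time in seconds
    end: float             # end time in seconds
    refusal: bool = False  # assumed flag: the system rejected or failed to handle the input


def objective_metrics(turns):
    """Return simple descriptive measures for one dialogue."""
    durations = [t.end - t.start for t in turns]
    return {
        "speaking_turns": len(turns),
        "mean_utterance_duration": mean(durations) if durations else 0.0,
        "refusals": sum(1 for t in turns if t.refusal),
    }


# A four-turn toy dialogue in which the system refuses once.
log = [
    Turn("user", 0.0, 2.0),
    Turn("system", 2.5, 4.5, refusal=True),
    Turn("user", 5.0, 8.0),
    Turn("system", 8.5, 10.5),
]
metrics = objective_metrics(log)
print(metrics)
```

Measures of this kind would typically be aggregated over many dialogues and subjects before any inferential analysis; the subjective side (interviews, questionnaires) requires separate instruments and is not covered by this sketch.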
We first present the methods currently used in oral and multimodal MMD as
well as in MMI (section 10.1). This leads us to underline the weak points of
assessment and the challenges for the years to come (section 10.2), and then
to suggest a few paths that could be followed, especially for multimodal
dialogue (section 10.3).
10.1. Dialogue system assessment feasibility
A growing number of articles are being published on the assessment of oral
and multimodal dialogue systems. The assessment paradigms they propose are
increasingly broad and complex, covering notably metrics, user tests and
methods for analyzing the questionnaires that subjects fill out after using
the system. These efforts are relevant and should be applauded, but we should
not forget a few recurrent observations that still hold true.
First observation: unlike information extraction, speech recognition or
syntactic analysis systems, MMD systems often remain research prototypes that
are hard to build and to get working correctly, and that are very sensitive
to user behavior. Apart from a few marginal, recreational examples, not a
single system has been sold to the public and used profitably by a large
number of people. In other words, scalability is still a major issue in MMD,
and the assessments carried out are limited to research prototypes or to
professional systems that are so task specific (such as military systems)
that they concern only an extremely small number of users.