Chapter 10
Multimodal Dialogue System Assessment
Whether it takes place at the very end of the design process, on the final system, or during design itself, on prototypes or system modules, assessment aims to measure performance, compare that performance with the performance of existing systems, and identify strengths and weaknesses. If resources allow, the latter can lead to revisiting a design phase in order to improve the system.
Although it is often disparaged, perhaps because it touches on raw nerves, assessment can also bring a valuable perspective and usable methods to design and implementation. Regarding the assessment of algorithms for the automatic generation of referring expressions, Krahmer and Van Deemter [KRA 12] observe that early work was not explicit about the parameters used in the algorithms, and that it was only once assessments were carried out that researchers had to lay all their cards on the table and describe their preferred parameters. Assessment campaigns have also helped bring together researchers working on the same issues, thereby contributing to the general momentum of the field.
However, assessment comes with many constraints that can be discouraging. For example, comparing several systems requires projecting the results of each system onto a common formalism that makes comparison possible, as sketched below. Yet this projection can be very time-consuming and contributes nothing to the system itself.
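To make this projection step concrete, the following is a minimal sketch in Python, under purely illustrative assumptions: two hypothetical systems report their results in different native formats, and each is mapped onto a shared record before any comparison. All names and figures (CommonResult, the raw dictionaries, the metric values) are invented for the example and do not come from any actual assessment campaign.

from dataclasses import dataclass

@dataclass
class CommonResult:
    """Shared record used as the common formalism for comparison."""
    system: str
    task_success_rate: float  # fraction of dialogues completed successfully
    mean_turns: float         # average number of user turns per dialogue

# System A reports raw counts; System B reports percentages and averages.
system_a_raw = {"completed": 42, "total": 50, "turns": 410}
system_b_raw = {"success_pct": 78.0, "avg_turns": 9.3}

def project_a(raw):
    # Convert System A's raw counts into the shared record.
    return CommonResult("A", raw["completed"] / raw["total"],
                        raw["turns"] / raw["total"])

def project_b(raw):
    # Convert System B's percentages and averages into the shared record.
    return CommonResult("B", raw["success_pct"] / 100.0, raw["avg_turns"])

for r in (project_a(system_a_raw), project_b(system_b_raw)):
    print(f"{r.system}: success rate = {r.task_success_rate:.2f}, "
          f"mean turns = {r.mean_turns:.1f}")

Even in this toy form, the projection functions embody the costly part of the exercise: every new system requires its own hand-written mapping, and none of this work improves the systems being compared.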
In this chapter, we will not present the assessment methods specific to each module of a dialogue system, such as those for a speech recognition engine, those for an anaphora resolution module or even