Chapter 10
Multimodal Dialogue System Assessment
Whether it takes place at the very end of the design process, on the final system, or during design itself, on prototypes or system modules, assessment aims to measure performance, compare that performance with the performance of existing systems, and identify strengths and weaknesses. If resources allow, the latter can lead to revisiting a design phase in order to improve the system.
Although it is often disparaged, perhaps because it touches on raw nerves, assessment can also bring a valuable perspective and usable methods to design and implementation. Regarding the assessment of algorithms for the automatic generation of referring expressions, Krahmer and Van Deemter [KRA 12] observe that early work was not explicit about the parameters used in the algorithms, and that it was only once assessments were carried out that researchers had to lay all their cards on the table and describe their preferred parameters. Assessment campaigns have also helped bring together researchers working on the same issues, thereby contributing to the general momentum of the field.
However, assessment comes with many constraints that can be discouraging. For example, comparing several systems requires projecting the results of each system onto a common formalism that makes comparison possible, as sketched below. Yet this projection can be very time-consuming and contributes nothing to the system itself.
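To make this projection step concrete, the following is a minimal sketch in Python, under purely illustrative assumptions: two hypothetical systems report their results in different native formats, and each is mapped onto a shared record before any comparison. All names and figures (CommonResult, the raw dictionaries, the metric values) are invented for the example and do not come from any actual assessment campaign.

from dataclasses import dataclass

@dataclass
class CommonResult:
    """Shared record used as the common formalism for comparison."""
    system: str
    task_success_rate: float  # fraction of dialogues completed successfully
    mean_turns: float         # average number of user turns per dialogue

# System A reports raw counts; System B reports percentages and averages.
system_a_raw = {"completed": 42, "total": 50, "turns": 410}
system_b_raw = {"success_pct": 78.0, "avg_turns": 9.3}

def project_a(raw):
    # Convert System A's raw counts into the shared record.
    return CommonResult("A", raw["completed"] / raw["total"],
                        raw["turns"] / raw["total"])

def project_b(raw):
    # Convert System B's percentages and averages into the shared record.
    return CommonResult("B", raw["success_pct"] / 100.0, raw["avg_turns"])

for r in (project_a(system_a_raw), project_b(system_b_raw)):
    print(f"{r.system}: success rate = {r.task_success_rate:.2f}, "
          f"mean turns = {r.mean_turns:.1f}")

Even in this toy form, the projection functions embody the costly part of the exercise: every new system requires its own hand-written mapping, and none of this work improves the systems being compared.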
In this chapter, we will not present the assessment methods specific to each module of a dialogue system, such as those for a speech recognition engine, those for an anaphora resolution module or even