CONSTRUCTION AND ANALYSIS OF A MULTI-LAYERED IN-CAR SPOKEN DIALOGUE CORPUS - DSP for In-Vehicle and Mobile Systems


6. SUMMARY

In this chapter, we have presented a brief description of a multimedia corpus of in-car speech communication developed at CIAIR, Nagoya University, Japan. The corpus consists of synchronously recorded multi-channel audio/video signals, driving signals, and differential GPS readings. For the restaurant information query task domain, speech dialogues were collected from over 800 drivers, split equally between male and female drivers, in four different modes, namely, human-human, human-machine, prompted, and natural. In addition, we have experimented with an ASR system for collecting human-machine dialogues. Every spoken dialogue is transcribed with precise time stamps.

We have proposed the concept of a Layered Intention Tag (LIT) for sequential analysis of dialogue speech. Towards that end, we have tagged one half of the complete corpus with LITs. We have also attached structured dependency information to the corpus. With these additions, the in-car speech dialogue corpus has been enriched into a multi-layered corpus. By studying the different layers of the corpus, different aspects of the dialogue can be analyzed.

Currently, we are exploring the relationship between an LIT and both the number of phrases and the occurrence rate of fillers, with the objective of developing a corpus-based dialogue management platform.

ACKNOWLEDGEMENT

This work has been supported in part by a Grant-in-Aid for Center of Excellence (COE) Research No. 11CE2005 from the Ministry of Education, Science, Sports and Culture, Japan. The authors would like to acknowledge the members of CIAIR for their enormous contribution and efforts towards the construction of the in-car spoken dialogue corpus.

