6. SUMMARY
In this chapter, we have presented a brief description of a multimedia corpus
of in-car speech communication developed at CIAIR, Nagoya University,
Japan. The corpus consists of synchronously recorded multi-channel
audio/video signals, driving signals, and a differential GPS reading. For a
restaurant information query task domain, speech dialogues were collected
from over 800 drivers (split equally between male and female drivers) in four
different modes, namely, human-human, human-machine, prompted, and
natural. In addition, we have experimented with an ASR system for collecting
human-machine dialogues. Every spoken dialogue is transcribed with precise
time stamps.
We have proposed the concept of a Layered Intention Tag (LIT) for
sequential analysis of dialogue speech. Towards that end, we have tagged one
half of the complete corpus with LITs. We have also attached structured
dependency information to the corpus. With these additions, the in-car speech
dialogue corpus has been enriched into a multi-layered corpus. By studying
different layers of the corpus, different aspects of the dialogue can be
analyzed.
Currently, we are exploring the relationship between an LIT and the
number of phrases and the occurrence rate of fillers, with the objective of
developing a corpus-based dialogue management platform.
ACKNOWLEDGEMENT
This work has been supported in part by a Grant-in-Aid for Center of
Excellence (COE) Research No. 11CE2005 from the Ministry of Education,
Science, Sports and Culture, Japan. The authors would like to acknowledge
the members of CIAIR for their enormous contribution and efforts towards
the construction of the in-car spoken dialogue corpus.