Chapter 3
A SPOKEN DIALOG CORPUS FOR CAR TELEMATICS SERVICES
Masahiko Tateishi¹, Katsushi Asami¹, Ichiro Akahori¹, Scott Judy², Yasunari
Obuchi³, Teruko Mitamura², Eric Nyberg², and Nobuo Hataoka⁴
¹ Research Laboratories, DENSO CORPORATION, 500-1, Minamiyama, Nisshin, Aichi, 470-0111, Japan
² Language Technologies Institute, Carnegie Mellon University, Pittsburgh, PA 15213-3890, USA
³ Advanced Research Laboratory, Hitachi Ltd., 1-280, Higashi-koigakubo, Kokubunji, Tokyo, 185-8601, Japan
⁴ Central Research Laboratory, Hitachi Ltd., 1-280, Higashi-koigakubo, Kokubunji, Tokyo, 185-8601, Japan
Email: mtatei@rlab.denso.co.jp
Abstract:
Spoken corpora provide a critical resource for research, development and
evaluation of spoken dialog systems. This chapter describes the spoken dialog
corpus used in the design of CAMMIA (Conversational Agent for Multimedia
Mobile Information Access), which employs a novel dialog management
system that allows users to switch dialog tasks in a flexible manner. The corpus
for car telematics services was collected from 137 male and 113 female
speakers. The age distribution of speakers is balanced across five age brackets:
20s, 30s, 40s, 50s, and 60s. Analysis of the gathered dialogs reveals that the
average number of dialog tasks per speaker was 8.1. The three most frequently
requested types of information in the corpus were traffic information, tourist
attraction information, and restaurant information. Analysis of speaker
utterances shows that the implied vocabulary size is approximately 5,000
words. The results are used for development and evaluation of automatic
speech recognition (ASR) and dialog management software.
Keywords:
Spoken Dialog Corpus, Telematics, Speech Recognition, Dialog Tasks.