Digital Signal Processing Reference
1. INTRODUCTION
The term telematics refers to the emerging industry of communication,
information, and entertainment services delivered to motor vehicles via
wireless network technology. A telematics system must provide a human-
machine interface (HMI) that lets drivers operate the device, system, or
service easily and without compromising traffic safety. A spoken dialog
system is considered to be the most suitable HMI for telematics, since it
allows the driver to keep “hands on the wheel, eyes on the road”.
The Conversational Agent for Multimedia Mobile Information Access
(CAMMIA) provides a framework for client-server implementation of spoken
dialog systems in mobile, hands-free environments [1][5]. The goal of
CAMMIA is to realize large-scale speech dialog systems that can handle a
variety of information retrieval tasks. CAMMIA is based on VoiceXML, a
markup language for speech dialog systems that has been proposed as a
standard by the W3C [7]. The client is an in-vehicle terminal with an automatic
speech recognition (ASR) system, a VoiceXML interpreter, and a text-to-
speech (TTS) system; the server is a separate computer which runs a Dialog
Manager (DM) module [5]. The client recognizes the driver's utterances
according to the VoiceXML dialog scenarios and transmits the recognition
results to the server as requests. The server then searches its
database and converts the search results into VoiceXML files, which it
returns to the client as the response.
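The server-side step described above can be illustrated with a minimal Python sketch. It is not CAMMIA's implementation: the restaurant records, the request format (a dictionary with a "cuisine" key), and the function names search, results_to_vxml, and handle_request are all hypothetical. The sketch only shows the general pattern of receiving a recognition result, querying a data store, and wrapping the answer in a VoiceXML 2.0 document.

```python
import xml.etree.ElementTree as ET

# Hypothetical in-memory data store; the actual CAMMIA database is not described here.
RESTAURANTS = [
    {"name": "Sakura", "cuisine": "japanese"},
    {"name": "Trattoria Roma", "cuisine": "italian"},
    {"name": "Kyoto Garden", "cuisine": "japanese"},
]


def search(cuisine):
    """Return the names of entries whose cuisine matches the recognized value."""
    wanted = cuisine.lower()
    return [r["name"] for r in RESTAURANTS if r["cuisine"] == wanted]


def results_to_vxml(names):
    """Wrap search results in a minimal VoiceXML 2.0 document with one prompt."""
    vxml = ET.Element(
        "vxml", {"version": "2.0", "xmlns": "http://www.w3.org/2001/vxml"}
    )
    form = ET.SubElement(vxml, "form", {"id": "results"})
    block = ET.SubElement(form, "block")
    prompt = ET.SubElement(block, "prompt")
    if names:
        prompt.text = "I found " + ", ".join(names) + "."
    else:
        prompt.text = "Sorry, I found nothing."
    # encoding="unicode" makes tostring return a str instead of bytes.
    return ET.tostring(vxml, encoding="unicode")


def handle_request(request):
    """Assumed request shape: {'cuisine': <recognized word>} sent by the client."""
    return results_to_vxml(search(request["cuisine"]))


if __name__ == "__main__":
    print(handle_request({"cuisine": "japanese"}))
```

For a request such as {"cuisine": "japanese"}, handle_request returns a VoiceXML document whose prompt lists the matching entries, which a client-side interpreter would then pass to TTS. A real DM would also track dialog state across turns, which this sketch omits.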
One novel aspect of CAMMIA is the natural conversational interaction
between the user and the system, supported by a DM module that allows the
user to change dialog tasks flexibly. Many of the system requirements
associated with natural spoken dialog can be ascertained by studying human
behavior as observed in large collections of spoken or written data.
Specifically, the analysis includes defining a lexicon and grammar for ASR,
as well as designing suitable dialog scenarios for use by the DM.
Human-computer dialog differs from human-human dialog in several
respects, including linguistic complexity [2]. However, the examination of
human-human dialogs is a natural first step in the process of modeling human
dialog behavior [3]. The modeling approach requires very large quantities of
task-oriented linguistic data. To meet this requirement, we collected a spoken
dialog corpus for car telematics services. In this chapter, Section 2 outlines
the system architecture of CAMMIA. Section 3 explains the spoken dialog
corpus collection. Section 4 describes the analysis of the corpus, followed by
conclusions.