Digital Signal Processing Reference
1. INTRODUCTION
The application of telematics in the car environment involves the
integration of onboard computers, onboard devices, global positioning and
wireless communication systems.
Because safe, reliable and comfortable interaction with these systems is
particularly important while driving, Automatic Speech Recognition (ASR)
technology for the car environment has attracted growing interest for the
automotive applications now appearing on the market.
Robustness and flexibility of hands-free ASR systems in adverse
environments are still challenging research topics [1]-[4]. Speech signals
acquired by hands-free systems in a moving car are generally characterized
by a low signal-to-noise ratio (SNR) and are affected by various sources of
corruption. Engine and tyres contribute mainly low-frequency noise, while
aerodynamic turbulence, predominant at high speed, has a broader spectral
content. Other noise components are non-stationary and unpredictable (e.g.,
road bumps, rain, traffic noise).
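As a reminder of what "low SNR" measures, the ratio of average speech power to average noise power can be expressed in decibels. The sketch below is illustrative only: the function name snr_db and the assumption that separate speech and noise sample sequences are available are hypothetical and are not taken from the system described here.

```python
import math


def snr_db(speech, noise):
    """Return the signal-to-noise ratio in dB.

    Assumes (hypothetically) that clean speech samples and noise samples
    are available as two separate, non-empty sequences of numbers.
    """
    # Average power is the mean of the squared samples.
    p_speech = sum(x * x for x in speech) / len(speech)
    p_noise = sum(x * x for x in noise) / len(noise)
    # SNR in decibels: ten times the base-10 log of the power ratio.
    return 10.0 * math.log10(p_speech / p_noise)
```

With this definition, noise whose amplitude is one tenth of the speech amplitude gives an SNR of about 20 dB, and equal powers give 0 dB.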
Recognition accuracy is further reduced by the acoustic effects of the car
enclosure, spontaneous speech phenomena and changes in speaking style
(i.e., the Lombard effect), especially in conjunction with the word
confusability induced by large vocabularies.
The European project VICO (Virtual Intelligent CO-driver) aims to develop
an advanced in-car dialogue system for natural-language voice interaction
with an agent able to provide services such as navigation, route planning,
hotel and restaurant reservation, tourist information and car manual
consultation [5],[6]. The planned system includes a robust hands-free speech
recognizer, connected to a natural language understanding module that allows
spontaneous speech interaction, and an advanced, flexible dialogue manager
able to adapt itself to a wide range of dialogue situations. A further
module provides the interface for dynamic information retrieval and
efficient data extraction from databases containing geographic and tourist
information. Voice interaction can be in English, German or Italian. ITC-irst
is in charge of developing the ASR engine for Italian, while the
corresponding engines for English and German are developed by
DaimlerChrysler AG [7].
All the modules are integrated into a CORBA system architecture, and a
common interface was specified to connect the recognizers of the different
languages to the same natural language understanding module. Due to the
need for alignment among the language models of the different languages, as
well as the need to reduce complexity when managing large vocabularies
(e.g., lists of streets and points of interest in a city, cities in a region, etc.), a