Digital Signal Processing Reference
In-Depth Information
system is an in-vehicle, naturally spoken, mixed-initiative dialog system to
obtain real-time navigation and route planning information using GPS and
information retrieval from the WWW. A prototype in-vehicle platform was
developed for speech corpora collection and system development. This
includes the development of robust data collection and front-end processing
for recognition model training and adaptation, as well as a back-end
information server to obtain interactive automobile route planning
information from the WWW.
The novel aspects presented in this chapter include the formulation of a
new microphone array and multi-channel noise suppression front-end,
environmental (sniffer) classification for changing in-vehicle noise
conditions, and a back-end navigation information retrieval task. We also
discuss aspects of corpus development. Most multi-channel data acquisition
algorithms focus merely on standard delay-and-sum beamforming methods.
The new noise-robust speech processing system uses a five-channel array with
a constrained switched adaptive beamformer for the speech and a second for
the noise. The speech adaptive beamformer and noise adaptive beamformer
work together to suppress interference prior to the speech recognition task.
The processing employed can improve segmental SNR (SegSNR) by
more than 10 dB, thereby suppressing background noise sources inside the
car environment (e.g., road noise from passing cars, wind noise from open
windows, turn signals, air conditioning noise, etc.).
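To make the SegSNR measure and the delay-and-sum baseline concrete, the following is a minimal sketch in Python. It is not the constrained switched adaptive beamformer described in this chapter; it only illustrates the standard delay-and-sum method mentioned above, together with one common way of computing segmental SNR. The function names, the 256-sample frame length, the [-10, 35] dB clamping range, the integer sample delays, and the synthetic sine-plus-noise signals are all assumptions made for illustration and do not come from the CU-Move system.

```python
import math
import random

def segmental_snr(clean, processed, frame_len=256, floor_db=-10.0, ceil_db=35.0):
    """Average the per-frame SNR (in dB) after clamping each frame's value.

    The clamping range is a common convention, assumed here for illustration.
    """
    frame_snrs = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        ref = clean[start:start + frame_len]
        out = processed[start:start + frame_len]
        sig = sum(r * r for r in ref)
        err = sum((r - o) ** 2 for r, o in zip(ref, out))
        if sig == 0.0:
            continue  # skip frames with no reference energy
        snr = 10.0 * math.log10(sig / err) if err > 0.0 else ceil_db
        frame_snrs.append(min(max(snr, floor_db), ceil_db))
    return sum(frame_snrs) / len(frame_snrs)

def delay_and_sum(channels, delays):
    """Advance each channel by its integer sample delay, then average."""
    n = min(len(ch) - d for ch, d in zip(channels, delays))
    return [sum(ch[i + d] for ch, d in zip(channels, delays)) / len(channels)
            for i in range(n)]

# Synthetic test case (hypothetical): a 200 Hz tone sampled at 8 kHz reaches
# five microphones with delays of 0..4 samples, each with independent noise.
random.seed(0)
fs, n_samples = 8000, 4096
clean = [math.sin(2.0 * math.pi * 200.0 * i / fs) for i in range(n_samples)]
delays = [0, 1, 2, 3, 4]
channels = []
for d in delays:
    delayed = [0.0] * d + clean  # the tone arrives d samples late
    channels.append([x + random.gauss(0.0, 0.5) for x in delayed])

beam = delay_and_sum(channels, delays)
snr_single = segmental_snr(clean, channels[0][:n_samples])
snr_beam = segmental_snr(clean, beam)
print(f"single microphone SegSNR: {snr_single:.1f} dB")
print(f"delay-and-sum SegSNR:     {snr_beam:.1f} dB")
```

With independent noise on each of M channels, averaging the aligned channels reduces noise power by a factor of about M, so this baseline gains roughly 10 log10(5), or about 7 dB, in the idealized case. Real in-vehicle noise is partly correlated across microphones, which is one reason adaptive methods are of interest.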
This chapter is organized as follows. In Sec. 2, we present our proposed
in-vehicle system. In Sec. 3, we discuss the CU-Move corpus. In Sec. 4, we
consider advances in array processing, followed by environmental sniffing,
automatic speech recognition (ASR), and our dialog system with
connections to the WWW. Sec. 5 concludes with a summary and discussion of
areas for future work.
2. CU-MOVE SYSTEM FORMULATION
The problem of voice dialog within vehicle environments offers some
important speech research challenges. Speech recognition in car environments
is in general fragile, with word error rates (WER) ranging from 30% to 65%
depending on driving conditions. These changing environmental conditions