Chapter 7
A Novel Way to Start Speech Dialogs
in Cars by Talk-and-Push (TAP)
Balázs Fodor, David Scheler, and Tim Fingscheidt
Abstract The obligation to press a push-to-speak button before issuing a voice
command to a speech dialog system is not only inconvenient but also leads to
decreased recognition accuracy if the user starts speaking prematurely. In this
chapter, we investigate the performance of a so-called talk-and-push (TAP) system,
which permits the user to begin an utterance within a certain time frame before or
after pressing the button. This is achieved using a speech signal buffer in conjunction
with an acoustic echo cancellation unit and combined noise reduction
and start-of-utterance detection. In comparison with a state-of-the-art system
employing loudspeaker muting, the TAP system delivers significant improvements
in the word error rate.
Keywords Acoustic echo cancellation • Frequency-domain adaptive filter
(FDAF) • Noise reduction • Automatic speech recognition • In-car speech
dialog • Push-to-speak
7.1 Introduction
Modern in-car speech dialog systems require the user to press a push-to-speak
(PTS) button to initiate a dialog. The button press is normally followed by an
acoustic acknowledgment tone indicating that the user may start speaking.
In practice, this procedure often causes degraded system performance due to
nonconforming user behavior. For example, an inexperienced user cannot be
expected to wait for the acknowledgment tone before they start speaking. Instead,
the start of utterance (SOU) is likely to occur before the beep or, even worse, before
B. Fodor (*) • D. Scheler • T. Fingscheidt
Technische Universität Braunschweig, Institute for Communications Technology,
Braunschweig, Germany
e-mail: Fodor@ifn.ing.tu-bs.de; scheler@ifn.ing.tu-bs.de; fingscheidt@ifn.ing.tu-bs.de