Digital Signal Processing Reference
[Figure: block diagram. Recoverable signal labels: x(n) (FM radio signal), h(n), v( ), d(n), s(n), n(n), y(n), e(n). Recoverable block labels: PT S; LEM System Model; hands-free microphone; TAP System; AEC, Postfilter; Notch Filter, Noise Reduction, VAD; VAD Control; ASR initialization, SOU trigger; Robust ASR.]
Fig. 7.1 Block diagram of the talk-and-push (TAP) system
discrete-time domain, using n as the discrete-time index at sampling frequency f_s = 16 kHz,
the microphone signal can thus be expressed as the sum:
$$ y(n) = s(n) + d(n) + n(n) \tag{7.1} $$
This relation is depicted on the bottom left of Fig. 7.1.
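The additive model of Eq. (7.1) can be sketched in a few lines of Python. This is only an illustration: the three components below are synthetic placeholders (two sinusoids and Gaussian noise), not real recordings, and the helper name `microphone_signal` is hypothetical. Only the sampling frequency of 16 kHz is taken from the text.

```python
import math
import random

FS = 16_000  # sampling frequency f_s in Hz, as stated in the text


def microphone_signal(s, d, n):
    """Return y(n) = s(n) + d(n) + n(n), sample by sample (Eq. 7.1)."""
    if not (len(s) == len(d) == len(n)):
        raise ValueError("all components must have the same length")
    return [si + di + ni for si, di, ni in zip(s, d, n)]


# Toy components (assumed placeholders): 10 ms of signal at 16 kHz.
num_samples = FS // 100
rng = random.Random(0)
s = [0.5 * math.sin(2 * math.pi * 200 * k / FS) for k in range(num_samples)]   # stands in for s(n)
d = [0.2 * math.sin(2 * math.pi * 1000 * k / FS) for k in range(num_samples)]  # stands in for d(n)
n = [0.01 * rng.gauss(0.0, 1.0) for _ in range(num_samples)]                   # stands in for n(n)

y = microphone_signal(s, d, n)
```

The point of the sketch is that the microphone only ever observes the sum y(n); the individual components are not available separately to the later processing stages.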
To model the acoustic leakage from the loudspeaker into the microphone, we
assume that the echo signal d(n) results from the loudspeaker source signal x(n) by
convolution with a discrete-time, time-variant impulse response
$$ \mathbf{h}(n) = [h_0(n), h_1(n), \ldots, h_{N-1}(n)]^T, \tag{7.2} $$
where N denotes the finite impulse response length and (·)^T is the transpose.
For simplicity, a mono source signal x(n) is assumed. The impulse response h(n)
models the entire loudspeaker-enclosure-microphone (LEM) system, i.e., the
path from the digital-to-analog converter before the loudspeaker via the acoustic
enclosure to the analog-to-digital converter after the microphone.
Hence, the reverberated loudspeaker signal can be written as
$$ d(n) = \mathbf{h}^T(n) \cdot \mathbf{x}(n), \tag{7.3} $$
where · denotes the scalar product and the vector x(n) = [x(n), x(n-1), ..., x(n-N+1)]^T is a
time-inverted segment of the loudspeaker signal of length N.
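The scalar product of Eq. (7.3) can be written out directly. The sketch below assumes that loudspeaker samples before index 0 are zero, which the text does not specify. The function names and the toy impulse response `toy_lem` (a direct path plus one reflection with slowly drifting gain) are hypothetical and purely illustrative.

```python
def echo_sample(h_n, x, n):
    """Return d(n) = h(n)^T x(n) for one time index n (Eq. 7.3).

    h_n : list of N coefficients h_0(n) ... h_{N-1}(n) valid at time n
    x   : loudspeaker signal as a list; samples before index 0 are
          assumed to be zero (an assumption of this sketch)
    """
    N = len(h_n)
    # x(n) = [x(n), x(n-1), ..., x(n-N+1)]^T, the time-inverted segment
    x_vec = [x[n - i] if n - i >= 0 else 0.0 for i in range(N)]
    return sum(h_i * x_i for h_i, x_i in zip(h_n, x_vec))


def echo_signal(h_of_n, x):
    """Apply a time-variant impulse response; h_of_n(n) returns h(n)."""
    return [echo_sample(h_of_n(n), x, n) for n in range(len(x))]


def toy_lem(n):
    """Hypothetical time-variant LEM response of length N = 3."""
    return [0.5, 0.0, 0.25 + 0.001 * n]


x = [1.0, 0.0, 0.0, 2.0]
d = echo_signal(toy_lem, x)
```

With these toy values, `d` comes out as approximately `[0.5, 0.0, 0.252, 1.0]`: the third sample is the first input sample scaled by the drifting reflection gain h_2(2), which is where the time variance of h(n) becomes visible.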
As shown in Fig. 7.1, the first stage of the TAP system is an acoustic echo
cancelation (AEC) unit. It computes an estimate d̂(n) of the echo component
according to [4] and subtracts it from the microphone signal.
For this purpose, the LEM system transfer function is estimated using the FDAF
described in Sect. 7.3. The FDAF furthermore contains a postfilter, which reduces
residual echo components as well as some background noise n(n) present in the
microphone signal.
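To illustrate the estimate-and-subtract principle only, the following sketch uses a plain time-domain normalized LMS (NLMS) adaptive filter. This is a deliberately simpler stand-in and not the FDAF of Sect. 7.3 or the method of [4]; it also omits the postfilter. The function name, the step size `mu`, the regularization `eps`, and the 3-tap toy "room" are all assumptions made for the example.

```python
import random


def nlms_echo_canceller(x, y, num_taps, mu=0.5, eps=1e-8):
    """Time-domain NLMS sketch (a stand-in, not the FDAF of the text).

    x : loudspeaker signal, y : microphone signal (same length).
    Returns the error signal e(n) = y(n) - d_hat(n) and the final
    impulse response estimate. mu and eps are illustrative choices.
    """
    h_hat = [0.0] * num_taps          # estimate of the LEM impulse response
    x_vec = [0.0] * num_taps          # [x(n), x(n-1), ..., x(n-N+1)]
    e = []
    for x_n, y_n in zip(x, y):
        x_vec = [x_n] + x_vec[:-1]    # shift the newest sample in
        d_hat = sum(h * xi for h, xi in zip(h_hat, x_vec))
        e_n = y_n - d_hat             # subtract the echo estimate
        norm = sum(xi * xi for xi in x_vec) + eps
        h_hat = [h + mu * e_n * xi / norm for h, xi in zip(h_hat, x_vec)]
        e.append(e_n)
    return e, h_hat


# Toy check: echo-only microphone signal through a fixed 3-tap "room"
# (no near-end speech, no noise; values are made up for illustration).
rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(2000)]
h_true = [0.5, -0.3, 0.1]
y = [sum(h_true[i] * x[n - i] for i in range(3) if n - i >= 0)
     for n in range(len(x))]

e, h_hat = nlms_echo_canceller(x, y, num_taps=3)
```

In this noise-free toy case the estimate `h_hat` converges towards `h_true` and the error signal decays towards zero. In the setting of the text, speech s(n) and noise n(n) would remain in e(n), which is why a postfilter for residual echo and noise is useful.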