Optimal Multi-Microphone Speech Enhancement in Cars - Digital Signal Processing for In-Vehicle Systems and Safety


factored into a sufficient statistic followed by a single-microphone post-filter. As a straightforward extension of [1], if we know the RIRs, optimal estimation of the speech signal can be achieved using this simple two-step method. However, the assumption of known RIRs is difficult to satisfy in practice. In this chapter, we address a realistic implementation of the sufficient statistic with unknown RIRs.

If we know the source signal, we can adaptively estimate the RIRs based on an acoustic echo cancellation scheme [4]. Because a more accurately beamformed output is closer to the original source signal, we might be able to use the beamformed output as a reference signal to estimate the RIRs [5]. In this chapter we propose using a delay-and-sum beamformer (DSB) to provide the information necessary for an initial constrained estimate of the RIR, which is then updated iteratively using a multi-path generalized sidelobe canceller (GSC) based on the evolving RIR estimate. A good RIR estimate makes the multi-path GSC more accurate, which in turn yields a better RIR estimate. We demonstrate that, with a reasonable constraint on the sparsity of the room impulse response, the algorithm converges to a useful approximate RIR. Even though we may not get perfect RIR identification, the converged RIR is nevertheless sufficient to compute coefficient vectors for a multi-path fixed beamformer (FBF) which outperforms the naive DSB. By leveraging the converged RIR, we are able to mitigate the common practical problem of multi-path GSC, namely, its tendency to cancel the target signal due to the indistinguishability of signal from reverberation at the beamformer.
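As a rough illustration of the delay-and-sum beamformer mentioned above, the following Python sketch aligns each microphone channel by its direct-path delay and averages the result. It is only a minimal sketch under simplifying assumptions that are not taken from the chapter: the delays are known integers, each channel contains a single delayed copy of the source (no reverberation), and the noise is white and independent across microphones. The names `delay_and_sum` and `snr_db` and all parameter values are hypothetical.

```python
import numpy as np

def delay_and_sum(signals, delays):
    """Align each channel by its integer sample delay and average.

    signals: array of shape (N, T), one row per microphone.
    delays:  length-N sequence of non-negative integer direct-path delays
             in samples (assumed known for this illustration).
    """
    signals = np.asarray(signals, dtype=float)
    n_mics, n_samples = signals.shape
    out = np.zeros(n_samples)
    for channel, d in zip(signals, delays):
        # Advance the channel by d samples; the last d output samples get no contribution.
        out[: n_samples - d] += channel[d:]
    return out / n_mics

def snr_db(estimate, reference):
    """Signal-to-noise ratio of an estimate against a clean reference, in dB."""
    noise = estimate - reference
    return 10 * np.log10(np.sum(reference**2) / np.sum(noise**2))

rng = np.random.default_rng(0)
n_samples, delays = 4000, [0, 3, 5, 8]      # hypothetical values
source = rng.standard_normal(n_samples)

# Each microphone observes a delayed copy of the source plus independent noise.
mics = np.zeros((len(delays), n_samples))
for i, d in enumerate(delays):
    mics[i, d:] = source[: n_samples - d]
mics += rng.standard_normal(mics.shape)

valid = n_samples - max(delays)             # samples where every channel contributes
single = snr_db(mics[0, :valid], source[:valid])
dsb = snr_db(delay_and_sum(mics, delays)[:valid], source[:valid])
print(f"single microphone: {single:.1f} dB, DSB: {dsb:.1f} dB")
```

With independent noise, averaging the aligned channels reduces the noise power roughly in proportion to the number of microphones. In a reverberant car cabin the reflected paths are not aligned by this procedure, which is one reason to refine the DSB with an RIR-aware beamformer.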

To visualize the situation in a tractable way, we first show the convergence of a simplified version of the proposed scheme. A simple simulation test shows that this method achieves sufficient blind deconvolution at the output of the FBF. We then evaluate the proposed algorithm using real-world moving-car recordings [6].

13.2 Proposed Method

13.2.1 Multi-path GSC

Multi-path GSC can be formulated as an optimization problem as shown in (13.1), which is a generalized version of GSC [7] under a known multi-path environment, represented by the RIR as coded into a constraint matrix $C$:

$$\arg\min_{\tilde{w}} E\left\{ \tilde{w}^T \tilde{y}(n)\, \tilde{y}(n)^T \tilde{w} \right\} \quad \text{subject to} \quad C^T \tilde{w} = f, \tag{13.1}$$

where $\hat{s}(n) = \tilde{w}^T \tilde{y}(n)$ is an estimated source signal at the current time $n$, $f = [1\ 0\ \cdots\ 0]^T$, $\tilde{y}(n)$ is a noisy signal vector measured by the microphone array, and $\tilde{w} = [w_1\ w_2\ \cdots\ w_{NL}]^T$ is the array of filter coefficients encoding the estimated $L$-tap inverse RIR filters for all of the $N$ recorded signals.
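A linearly constrained quadratic minimization of the form in (13.1) has a standard closed-form solution when the correlation matrix is invertible: writing $R = E\{\tilde{y}(n)\tilde{y}(n)^T\}$ (the symbol $R$ is introduced here for illustration), the minimizer is $\tilde{w} = R^{-1} C (C^T R^{-1} C)^{-1} f$. The Python sketch below checks this numerically. It is not the chapter's implementation: the constraint matrix is a random placeholder rather than one built from an RIR, the data are synthetic, and the function name `lcmv_weights` and all dimensions are hypothetical.

```python
import numpy as np

def lcmv_weights(R, C, f):
    """Minimize w^T R w subject to C^T w = f, for symmetric positive definite R."""
    Rinv_C = np.linalg.solve(R, C)                      # R^{-1} C
    return Rinv_C @ np.linalg.solve(C.T @ Rinv_C, f)    # R^{-1} C (C^T R^{-1} C)^{-1} f

rng = np.random.default_rng(1)
dim, n_constraints, n_frames = 12, 3, 5000   # dim plays the role of N*L (hypothetical sizes)

# Synthetic stacked observation vectors, one column per time index, with correlated entries.
mixing = rng.standard_normal((dim, dim))
Y = mixing @ rng.standard_normal((dim, n_frames))
R = Y @ Y.T / n_frames                       # sample estimate of E{y y^T}

C = rng.standard_normal((dim, n_constraints))  # placeholder; the chapter derives C from the RIR
f = np.zeros(n_constraints)
f[0] = 1.0                                     # f = [1 0 ... 0]^T

w = lcmv_weights(R, C, f)
w_ref = C @ np.linalg.solve(C.T @ C, f)        # another feasible vector (minimum-norm solution)

power = w @ R @ w                              # output power of the constrained minimizer
power_ref = w_ref @ R @ w_ref                  # output power of the feasible reference
print("constraint satisfied:", np.allclose(C.T @ w, f))
print("output power no larger than reference:", power <= power_ref)
```

The comparison against `w_ref` is only a sanity check: any vector satisfying the constraint should yield an output power at least as large as that of the constrained minimizer. A GSC typically reaches the same solution adaptively rather than by inverting $R$ directly.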
