ROBUST ASR INSIDE A VEHICLE USING BLIND PROBABILISTIC BASED UNDER-DETERMINED CONVOLUTIVE MIXTURE SEPARATION TECHNIQUE - DSP for In-Vehicle and Mobile Systems

1. INTRODUCTION

Spoken dialogue information retrieval applications are becoming popular
among mobile users, especially in automobiles. Due to the typical presence of
background noise, echoes, and other interfering signals inside a car, speech
recognition accuracy drops significantly. Since it is very hard to know a
priori (a) how the acoustic environment inside a car is changing, (b) the
number of interfering signals that are present, and (c) how they get
mixed at the microphone (sensor), it is not practical to train recognizers for
the appropriate range of typical noisy environments. Therefore, it is
imperative that automatic speech recognition (ASR) systems be robust to
mismatches between training and testing environments. One solution to
robustness is speech enhancement based on spectral estimation followed by
subtraction. The speech signal enhancement techniques developed so far
(a) remove noise by estimating it in the absence of speech (e.g., [1]) or
(b) separate noise, i.e., interfering signals, from the intended signals
(e.g., [2]).
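As a rough illustration of approach (a), the following is a minimal sketch of magnitude spectral subtraction, not the specific method of [1]. The function name `spectral_subtract`, the frame length, the spectral floor, and the synthetic tone-plus-white-noise test signal are all assumptions made for this example. It uses non-overlapping, unwindowed frames for brevity, whereas practical systems usually use overlapping windowed frames.

```python
import numpy as np

def spectral_subtract(noisy, noise_only, frame_len=256, floor=0.01):
    """Magnitude spectral subtraction on non-overlapping frames (sketch)."""
    # Average noise magnitude spectrum, estimated from a speech-free segment.
    n_noise = len(noise_only) // frame_len
    noise_frames = noise_only[:n_noise * frame_len].reshape(n_noise, frame_len)
    noise_mag = np.abs(np.fft.rfft(noise_frames, axis=1)).mean(axis=0)

    # Split the noisy signal into frames and move to the frequency domain.
    n = len(noisy) // frame_len
    frames = noisy[:n * frame_len].reshape(n, frame_len)
    spec = np.fft.rfft(frames, axis=1)
    mag = np.abs(spec)

    # Subtract the noise estimate, clamping to a small fraction of the
    # original magnitude so that no bin becomes negative.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)

    # Rescale each bin, which keeps the noisy phase, and resynthesize.
    gain = clean_mag / np.maximum(mag, 1e-12)
    enhanced_frames = np.fft.irfft(spec * gain, n=frame_len, axis=1)
    return enhanced_frames.reshape(-1)

# Hypothetical test signal: a 500 Hz tone in white noise, sampled at 8 kHz.
rng = np.random.default_rng(0)
fs, frame_len = 8000, 256
t = np.arange(fs) / fs
clean = np.sin(2 * np.pi * 500 * t)
noise = 0.3 * rng.standard_normal(2 * fs)
noisy = clean + noise[:fs]

# The second half of the noise plays the role of the speech-free segment.
enhanced = spectral_subtract(noisy, noise[fs:], frame_len)
n = len(enhanced)
mse_before = np.mean((noisy[:n] - clean[:n]) ** 2)
mse_after = np.mean((enhanced - clean[:n]) ** 2)
print(f"MSE before: {mse_before:.4f}, after: {mse_after:.4f}")
```

The sketch depends on a noise estimate taken while no speech is present. That estimate is hard to keep current when the environment inside a car keeps changing.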

In this chapter, we treat signal enhancement as the separation of mixed
signals received by an array of typically two microphones, rather than as
the subtraction of an estimated noise effect. This requires blind
techniques, since the nature and number of the signals, and the environment
that mixes them, are not known a priori. Most of the blind techniques
developed so far are based on Independent Component Analysis (ICA). These
techniques work well when the number of microphones is equal to the number
of signals (the intended speech signal plus unintended interfering signals).
Since it is not practical to know the number of signals present beforehand,
and since this number may change dynamically, techniques based on ICA are
not well suited to real-life applications. In addition, due to the chamber
effect inside a car, signals get mixed convolutively. Hence, we need
techniques that can separate convolutively mixed signals and that work well
when the number of microphones is less than the number of signals present.
Blind techniques that work when the number of microphones is less than the
number of signals are referred to as under-determined Blind Source
Separation (BSS). We have developed one such technique using a probabilistic
approach in the sparse domain [3].
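The following sketch illustrates the mixing model described above. It shows only the mixing, not the separation technique of [3]. Three sources reach two microphones through short FIR filters, so the mixture is both convolutive and under-determined. The source signals, filter lengths, and decaying random impulse responses are hypothetical values chosen for illustration. In the frequency domain the convolutive mixture becomes one 2 x 3 mixing matrix per frequency bin, which cannot be inverted. This is why methods that assume as many microphones as sources do not apply directly.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sources, n_mics, n_samples, n_taps = 3, 2, 1000, 8

# Hypothetical source signals (random stand-ins for speech and interference).
sources = rng.laplace(size=(n_sources, n_samples))

# Hypothetical impulse responses: h[i, j] is the FIR filter from source j
# to microphone i, with taps that decay over time.
decay = np.exp(-0.5 * np.arange(n_taps))
h = rng.normal(size=(n_mics, n_sources, n_taps)) * decay

# Convolutive mixing: x_i(t) = sum_j (h_ij * s_j)(t).
L = n_samples + n_taps - 1
mics = np.zeros((n_mics, L))
for i in range(n_mics):
    for j in range(n_sources):
        mics[i] += np.convolve(h[i, j], sources[j])

# With zero-padding to length L, the same mixture is a per-bin matrix
# product in the frequency domain: X(f) = H(f) S(f).
S = np.fft.fft(sources, n=L, axis=1)      # shape (n_sources, L)
H = np.fft.fft(h, n=L, axis=2)            # shape (n_mics, n_sources, L)
X = np.fft.fft(mics, axis=1)              # shape (n_mics, L)
X_model = np.einsum("ijk,jk->ik", H, S)
print("per-bin model matches:", np.allclose(X, X_model))

# Each H(f) is a 2 x 3 matrix, so its rank is at most 2 and it has no
# inverse that would recover all three sources.
max_rank = int(np.linalg.matrix_rank(np.moveaxis(H, 2, 0)).max())
print("max rank of H(f):", max_rank, "for", n_sources, "sources")
```

Because no linear inverse exists in this setting, a separation method has to rely on additional assumptions about the sources, such as the sparsity assumed by the probabilistic approach cited above.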

In this chapter, we apply that technique to signal enhancement. The next
section provides an overview of the technique. Section 3 describes the data
collection inside a vehicle. Section 4 provides the details of the ASR
experiment and the results. In this section the recognition accuracy
results obtained with and without convolutive under-determined BSS
