Digital Signal Processing Reference
1. INTRODUCTION
Spoken dialogue information retrieval applications are becoming popular
for mobile users, especially in automobiles. Because of the background
noise, echoes, and other interfering signals typically present inside a car,
speech recognition accuracy drops significantly. Since it is very hard to
know a priori (a) how the acoustic environment inside a car is changing,
(b) the number of interfering signals that are present, and (c) how they
get mixed at the microphone (sensor), it is not practical to train
recognizers for the appropriate range of typical noisy environments.
Therefore, it is imperative that automatic speech recognition (ASR)
systems be robust to mismatches between training and testing
environments. One approach to robustness is speech enhancement based on
spectral estimation followed by subtraction. The speech signal
enhancement techniques developed so far either (a) remove noise by
estimating it in the absence of speech (e.g., [1]) or (b) separate noise,
i.e., interference signals, from the intended signals (e.g., [2]).
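As a rough illustration of approach (a), the sketch below estimates an
average noise magnitude spectrum from a speech-free segment and subtracts
it from each frame of the noisy signal. It is a minimal sketch, not the
method of [1]: the function name spectral_subtract, the frame length, the
hop size, and the spectral floor are all assumptions chosen for the
example.

```python
import numpy as np

def spectral_subtract(noisy, noise_only, frame_len=256, hop=128, floor=0.01):
    """Magnitude spectral subtraction with overlap-add (illustrative only)."""
    window = np.hanning(frame_len)

    def frames(x):
        # Split x into overlapping, windowed frames.
        n = 1 + (len(x) - frame_len) // hop
        return np.stack([x[i * hop:i * hop + frame_len] * window for i in range(n)])

    # Average noise magnitude spectrum, estimated in the absence of speech.
    noise_mag = np.abs(np.fft.rfft(frames(noise_only), axis=1)).mean(axis=0)

    spec = np.fft.rfft(frames(noisy), axis=1)
    mag = np.abs(spec)
    # Subtract the noise estimate; a small floor keeps magnitudes non-negative.
    clean_mag = np.maximum(mag - noise_mag, floor * mag)
    # Reuse the noisy phase, as is common in spectral subtraction.
    clean_spec = clean_mag * np.exp(1j * np.angle(spec))

    # Weighted overlap-add resynthesis.
    out = np.zeros(len(noisy))
    norm = np.zeros(len(noisy))
    for i, frame in enumerate(np.fft.irfft(clean_spec, n=frame_len, axis=1)):
        start = i * hop
        out[start:start + frame_len] += frame * window
        norm[start:start + frame_len] += window ** 2
    return out / np.maximum(norm, 1e-8)

# Hypothetical test signal: a 440 Hz tone in white noise, sampled at 8 kHz.
rng = np.random.default_rng(0)
fs = 8000
t = np.arange(fs) / fs
clean = np.sin(2 * np.pi * 440 * t)
noisy = clean + 0.3 * rng.standard_normal(fs)
noise_only = 0.3 * rng.standard_normal(fs // 2)
enhanced = spectral_subtract(noisy, noise_only)
```

The first and last frames are covered by a single tapered window, so the
output is reliable only in the interior of the signal. The sketch also
assumes the noise is stationary between the speech-free segment and the
noisy segment, which is exactly the assumption that is hard to satisfy
inside a moving car.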
In this chapter, we treat signal enhancement as the separation of mixed
signals received by an array of, typically, two microphones, instead of
subtracting an estimate of the noise. This requires blind techniques,
since the nature and number of the signals and the environment that mixes
them are not known a priori. Most of the blind techniques developed so far
are based on Independent Component Analysis (ICA). These techniques work
well when the number of microphones equals the number of signals (the
intended speech signal plus unintended interfering signals). Since it is
not practical to know the number of signals beforehand, and this number
may change dynamically, ICA-based techniques are not well suited to
real-life applications. In addition, due to the chamber effect inside a
car, signals get mixed convolutively. Hence, we need techniques that can
separate convolutively mixed signals and that work well when the number
of microphones is less than the number of signals present. Blind
techniques that work when the number of microphones is less than the
number of signals are referred to as under-determined Blind Source
Separation (BSS). We have developed one such technique using a
probabilistic approach in the sparse domain [3].
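To make the mixing model concrete, the sketch below simulates an
under-determined convolutive mixture: three sources reach two microphones,
each through its own impulse response, so microphone i records the sum
over sources j of the source signal convolved with a filter h_ij. This
illustrates only the problem setting, not the separation technique of
[3]. The function name convolutive_mix, the random decaying filters, and
the Laplacian source signals are assumptions made for the example.

```python
import numpy as np

def convolutive_mix(sources, filters):
    """Mix N sources into M microphone signals through FIR filters.

    sources: array of shape (N, T)
    filters: array of shape (M, N, L); filters[i, j] is a hypothetical
             impulse response from source j to microphone i.
    Returns an array of shape (M, T).
    """
    n_mics, n_src, _ = filters.shape
    if sources.shape[0] != n_src:
        raise ValueError("filters and sources disagree on the number of sources")
    n_samples = sources.shape[1]
    mics = np.zeros((n_mics, n_samples))
    for i in range(n_mics):
        for j in range(n_src):
            # Each microphone hears every source filtered by its own path.
            mics[i] += np.convolve(sources[j], filters[i, j])[:n_samples]
    return mics

rng = np.random.default_rng(0)
n_src, n_mics, n_samples, filt_len = 3, 2, 1000, 16  # more sources than mics
# Laplacian samples as heavy-tailed stand-ins for speech and interference.
sources = rng.laplace(size=(n_src, n_samples))
# Random decaying impulse responses standing in for in-car reverberation.
decay = np.exp(-np.arange(filt_len) / 4.0)
filters = rng.standard_normal((n_mics, n_src, filt_len)) * decay
mics = convolutive_mix(sources, filters)
print(mics.shape)  # (2, 1000)
```

With filters of length 1 the model reduces to an instantaneous mixture,
mics = A @ sources with a 2 x 3 matrix A. Because A has more columns than
rows it cannot be inverted, which is why the equal-count assumption behind
standard ICA fails in this setting.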
In this chapter, we apply that technique to signal enhancement. The next
section provides an overview of the technique. Section 3 describes the
data collection inside a vehicle. Section 4 provides the details of the
ASR experiment and its results. In this section the recognition accuracy
results obtained with and without convolutive under-determined BSS