Digital Signal Processing Reference
Conventional beamforming methods, such as the linearly constrained minimum
variance (LCMV) beamformer [1] and the generalized sidelobe canceller (GSC), reduce
interference from undesired directions by exploiting the correlation among the
noise signals at different sensors [2]. However, these beamformers incur a high
computational burden when the adaptive filter must be long enough to suppress the
noise effectively. This cost makes them unattractive for systems embedded in
vehicular communication devices.
To solve this problem, we propose a novel algorithm based on spectral magnitude
modification that uses the structure of the generalized sidelobe canceller.
The proposed algorithm applies an auditory filterbank to the primary signal (the
output of the fixed beamformer) and to the noise reference signal (the output of the
blocking matrix) in order to estimate the spectral samples of the noise components.
These samples are then fed to a gain filter for spectral modification so that the
optimal spectral envelope of the desired signal can be obtained. This structure offers
several advantages over traditional beamforming methods, including improved
perceptual quality of speech, robustness against stationary ambient noise, and high
computational efficiency. We develop the proposed algorithm for a dual-microphone
array. To improve performance further, we also consider its optimal combination
with conventional adaptive noise cancellation, which is carried out in the
short-time Fourier transform (STFT) domain.
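The idea of estimating noise spectra from the two GSC branches and applying a gain to the primary branch can be illustrated with a minimal sketch. It is not the algorithm developed in this chapter: it swaps the auditory filterbank for plain windowed FFT frames, assumes a broadside look direction so that the fixed beamformer is a two-microphone average and the blocking matrix is a difference, assumes the noise at the two microphones is uncorrelated with equal power, and uses a simple spectral-subtraction-style gain. The names frame_spectra, spectral_gain, enhance, and the parameters frame_len, hop, and floor are all hypothetical.

```python
import numpy as np

def frame_spectra(x, frame_len=256, hop=128):
    """Hann-windowed short-time spectra, one row per frame.
    Stand-in for the auditory filterbank (an assumption of this sketch)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def spectral_gain(primary, noise_ref, floor=0.1):
    """Spectral-subtraction-style gain per frame and bin, limited to [floor, 1].
    The gain rule is an illustrative assumption, not the chapter's gain filter."""
    p_pow = np.abs(primary) ** 2
    n_pow = np.abs(noise_ref) ** 2
    gain = 1.0 - n_pow / np.maximum(p_pow, 1e-12)
    return np.clip(gain, floor, 1.0)

def enhance(mic1, mic2):
    """Return the gain-weighted primary spectra and the gain itself."""
    # Fixed beamformer: averaging passes a desired signal that is
    # time-aligned at both microphones (broadside look direction assumed).
    fbf = 0.5 * (mic1 + mic2)
    # Blocking matrix: the difference cancels the aligned desired signal
    # and leaves a noise reference.
    bm = 0.5 * (mic1 - mic2)
    primary = frame_spectra(fbf)
    noise_ref = frame_spectra(bm)
    gain = spectral_gain(primary, noise_ref)
    return gain * primary, gain
```

With uncorrelated, equal-power noise at the two microphones, the sum and difference branches carry the same average noise power, which is why the noise reference can be used directly in the gain here. In a real room that assumption generally fails, and the noise estimate would need the kind of filterbank analysis and gain design the text describes.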
12.2 Dual-Channel Speech Enhancement
12.2.1 Transfer Function Generalized Sidelobe Canceller (TFGSC)
The basic GSC structure is composed of a fixed beamformer (FBF), a blocking
matrix (BM), and a noise canceller filter (NC). The FBF forms a beam in the look
direction so that the acoustic signal from the desired speaker is passed while
interfering noises are suppressed. Then, the BM blocks the desired signal and
produces a noise reference signal. The NC generates a replica of the component
which is included in the FBF output and is correlated with the interference.
An enhanced speech signal is obtained by subtracting the replica from the output
of the FBF. Conventionally, these processes are often described in terms of sampled
data representation. The broadband GSC expression, which is based on general
transfer functions (TF) of room impulse responses (RIR), has recently been
introduced [3]. Compared with the simple attenuation-and-delay assumption on
RIRs, the TF-based BM forms a sharp null in the look direction so that leakage
of the desired speech is attenuated more effectively. Ideally, the BM would convey
a pure noise reference input to the NC. Moreover, the use of TFs in the FBF makes it
possible to keep the desired signal free from distortion in a highly reverberant
room. Gannot et al. developed this concept based on the transfer function
ratio (TFR) and constructed an adaptive GSC, the so-called TFGSC [2]. In Fig. 12.1,