compensation algorithm to decompose audio of interest and noise is introduced in
[35]. To enhance noisy MFCCs, the dynamics of the audio of interest can also be
modelled by an SLDM and those of the additive noise by an LDM [13]. An observation
model then describes how audio and noise produce the noisy observations, which
allows the features of the clean audio to be reconstructed. An extension [36]
includes time dependencies among the discrete state variables of the SLDM. Further,
a state model for the dynamics of noise can help to model non-stationary noise
sources [37]. Finally, incremental on-line adaptation of the feature space is
possible, for example by feature space maximum likelihood linear regression
(FMLLR) [38]. Again, we will now take a detailed look at selected popular approaches.
9.2.1 Feature Normalisation
9.2.1.1 Cepstral Mean Subtraction
CMS [11, 39] is a simple way to reduce the influence of noise and transmission
channel transfer functions on cepstral features. Its basic principle of mean
subtraction can also be applied to practically any other audio LLD. Often, the noise
can be considered comparatively stationary compared with the rapidly changing
characteristics of the audio signal of interest. Thus, CMS subtracts the
long-term average cepstral (or other) feature vector
$$ \mu = \frac{1}{T} \sum_{t=1}^{T} x_t \tag{9.10} $$
from the observed noise-corrupted feature vector sequence of length T:
$$ X = \{ x_1, x_2, \ldots, x_t, \ldots, x_T \} \tag{9.11} $$
This yields a new estimate $\tilde{x}_t$ of the signal in the feature domain:
$$ \tilde{x}_t = x_t - \mu, \qquad 1 \le t \le T \tag{9.12} $$
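The mean computation in (9.10) and the subtraction in (9.12) can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `cepstral_mean_subtraction`, the array layout (one feature vector per row), and the synthetic data are assumptions made for this example.

```python
import numpy as np


def cepstral_mean_subtraction(features):
    """Apply mean subtraction to a (T, D) array of feature vectors.

    Hypothetical helper for illustration: row t holds the feature
    vector x_t, so T is the sequence length and D the feature dimension.
    """
    x = np.asarray(features, dtype=float)
    mu = x.mean(axis=0)   # long-term average over all T frames, Eq. (9.10)
    return x - mu, mu     # normalised vectors, Eq. (9.12), and the mean


# Synthetic stand-in for cepstral features (assumed data, not real MFCCs).
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 13))

# A constant additive offset per coefficient, shared by all frames.
offset = rng.normal(size=13)
observed = clean + offset

normalised, mu = cepstral_mean_subtraction(observed)
reference, _ = cepstral_mean_subtraction(clean)

# Each coefficient of the normalised sequence has (numerically) zero mean,
# and the constant offset no longer affects the result.
print(np.allclose(normalised.mean(axis=0), 0.0))
print(np.allclose(normalised, reference))
```

Note that the mean is taken over the whole sequence, so this sketch assumes the complete sequence of length T is available before normalisation.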
The subtraction of the long-term average is particularly interesting in the cepstral
domain. The audio spectrum is multiplied by the channel transfer function (cf.
Sect. 6.2.1.4). The logarithm applied in the MFCC calculation turns this
multiplication into an addition, and the additive channel part can then be
eliminated by subtracting the cepstral mean from all input vectors. A disadvantage
of CMS, as opposed to HEQ, is its inability to handle non-linear noise effects.
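The argument can be written out as a short derivation. It is only a sketch under simplifying assumptions that are not stated in the text: the channel $H(f)$ is constant over the sequence, the symbols $s_t$ (clean cepstral vector) and $h$ (cepstral vector of the channel) are introduced here for illustration, and the Mel filterbank smoothing is ignored, so the additive relation holds only approximately for real MFCCs.

```latex
% Multiplication in the spectral domain (S_t: clean spectrum, assumed symbol)
|X_t(f)| = |S_t(f)| \, |H(f)|
% The logarithm turns the product into a sum
\log |X_t(f)| = \log |S_t(f)| + \log |H(f)|
% The cepstral transform is linear, so in the cepstral domain
x_t = s_t + h
% Long-term average as in (9.10), with \bar{s} the mean of the clean cepstra
\mu = \frac{1}{T} \sum_{t=1}^{T} (s_t + h) = \bar{s} + h
% Mean subtraction as in (9.12): the channel term cancels
\tilde{x}_t = x_t - \mu = s_t - \bar{s}
```

The constant channel term $h$ cancels, while the mean $\bar{s}$ of the clean features is removed along with it.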
 
 