Digital Signal Processing Reference
With $\tilde{x} = F(x)$ denoting the transformed feature and $x = G(\tilde{x})$ its inverse, the cumulative probability function of the original features can be rewritten by substitution as

$$
C(x) = \int_{-\infty}^{x} p(x')\,\mathrm{d}x'
= \int_{-\infty}^{F(x)} p(G(\tilde{x}))\,G'(\tilde{x})\,\mathrm{d}\tilde{x}
= \int_{-\infty}^{F(x)} \tilde{p}(\tilde{x})\,\mathrm{d}\tilde{x}
= \tilde{C}(F(x)). \tag{9.16}
$$
By that, the transformation converting the distribution $p(x)$ into the 'target' distribution $\tilde{p}(\tilde{x}) = p_{\text{ref}}(\tilde{x})$ can be expressed as
$$
\tilde{x} = F(x) = \tilde{C}^{-1}\!\left[C(x)\right] = C_{\text{ref}}^{-1}\!\left[C(x)\right], \tag{9.17}
$$
where $C_{\text{ref}}^{-1}$ is the inverse cumulative probability function of the reference distribution [1], and $C(\cdot)$ is the feature's cumulative probability function. To obtain the transformation per feature vector component, a rule of thumb is to use 500 uniform intervals between $\mu_i - 4\sigma_i$ and $\mu_i + 4\sigma_i$ for the derivation of the histograms, where $\mu_i$ and $\sigma_i$ are the mean and standard deviation of the $i$th feature vector element. A Gaussian probability distribution with zero mean and unit variance can be used per element as the reference probability distribution; this choice, however, ignores higher moments.
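As a minimal sketch of this per-component equalisation, the following follows the rule of thumb above: an empirical cumulative distribution from 500 uniform histogram intervals over $[\mu - 4\sigma, \mu + 4\sigma]$, mapped through the inverse cumulative distribution of a standard normal reference. The helper name `heq_transform` is hypothetical; numpy supplies the histogram and Python's stdlib `NormalDist` the inverse Gaussian CDF.

```python
import numpy as np
from statistics import NormalDist

def heq_transform(x, n_bins=500):
    """Histogram-equalise one feature component towards a standard normal
    reference distribution (zero mean, unit variance).

    Per the rule of thumb in the text, the histogram uses n_bins uniform
    intervals between mu - 4*sigma and mu + 4*sigma.
    """
    mu, sigma = x.mean(), x.std()
    lo, hi = mu - 4.0 * sigma, mu + 4.0 * sigma
    # Empirical cumulative probability function C(x) from the histogram.
    counts, edges = np.histogram(x, bins=n_bins, range=(lo, hi))
    cdf = np.cumsum(counts) / counts.sum()
    # Look up C(x) for every sample; clip so inv_cdf never sees 0 or 1.
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    c = np.clip(cdf[idx], 1e-6, 1.0 - 1e-6)
    # Eq. (9.17): x_tilde = C_ref^{-1}[C(x)] with a standard normal reference.
    ref = NormalDist(0.0, 1.0)
    return np.array([ref.inv_cdf(p) for p in c])

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=5000)  # strongly skewed input
x_eq = heq_transform(x)
print(round(float(x_eq.mean()), 2), round(float(x_eq.std()), 2))
```

After the transformation the component is approximately standard normal; the residual deviation stems from the finite histogram resolution and the clipped tails outside $[\mu - 4\sigma, \mu + 4\sigma]$.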
Of the feature normalisation strategies discussed above, CMS is the simplest; together with MVN, it is used most frequently. MVN usually leads to better results at slightly increased computational effort. However, both of these techniques provide only a linear transformation. This is different for HEQ, which is able to compensate non-linear effects, but requires a sufficient number of audio frames for good results. Further, HEQ corrects only monotonic transformations; this can cause a loss of information when random noise behaviour renders the required transformation non-monotonic.
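The two linear techniques can be sketched in a few lines each, assuming features arranged as frames (rows) by components (columns); the function names `cms` and `mvn` are illustrative, not from the text.

```python
import numpy as np

def cms(features):
    """Cepstral mean subtraction: remove the per-component mean
    over all frames (rows = frames, columns = components)."""
    return features - features.mean(axis=0)

def mvn(features, eps=1e-12):
    """Mean and variance normalisation: additionally scale each
    component to unit variance (eps guards constant components)."""
    return (features - features.mean(axis=0)) / (features.std(axis=0) + eps)

rng = np.random.default_rng(1)
feats = 3.0 * rng.standard_normal((100, 13)) + 5.0  # 100 frames, 13 components
print(np.allclose(cms(feats).mean(axis=0), 0.0))  # True
print(np.allclose(mvn(feats).std(axis=0), 1.0))   # True
```

Both are affine maps of each component, which is exactly why neither can undo a non-linear distortion, while HEQ can.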
9.2.2 Model Based Feature Enhancement
In model based audio enhancement, one usually models the audio and the noise individually, plus the way these two produce the observation. The features are then enhanced in favour of the audio of interest by use of these models. An example is a switching linear dynamic model (SLDM) for the dynamics of the clean audio of interest [13], which will be introduced next by way of the three models mentioned: for the noise, for the audio, and for their combination.
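Before the individual models are introduced, the combination step can be illustrated with a common assumption (not necessarily the one made in [13]): if clean audio and noise powers add, the noisy observation in the log-spectral domain follows $y = x + \log(1 + e^{\,n-x})$, where $x$ and $n$ are the clean and noise log powers.

```python
import numpy as np

def observation_model(x_log, n_log):
    """Combine clean audio and noise into the noisy observation in the
    log-spectral domain, assuming the powers add:
    exp(y) = exp(x) + exp(n)  =>  y = x + log(1 + exp(n - x))."""
    return x_log + np.log1p(np.exp(n_log - x_log))

x = np.log(np.array([4.0, 1.0]))  # clean log powers
n = np.log(np.array([1.0, 1.0]))  # noise log powers
y = observation_model(x, n)
print(np.exp(y))  # powers add back up: [5. 2.]
```

The non-linearity of this mapping is what makes model based enhancement in the log domain non-trivial: at high SNR $y \approx x$, at low SNR $y \approx n$, with a smooth transition in between.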
 