Speech Enhancement - Digital Speech: Coding for Low Bit Rate Communication Systems


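in which $\alpha_k$ and $\theta_k$ are dummy variables for the spectral amplitude and phase, respectively, of $X_k$. Writing $\sigma_{X_k}^2$ for the variance of $X_k$, the amplitude has the Rayleigh distribution given by,

$$p(\alpha_k) = \frac{2\alpha_k}{\sigma_{X_k}^2} \exp\left(-\frac{\alpha_k^2}{\sigma_{X_k}^2}\right) \qquad (11.29)$$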

and the phase has the uniform distribution given by,

$$p(\theta_k) = \frac{1}{2\pi} \qquad (11.30)$$
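The two distributions in (11.29) and (11.30) can be checked numerically. The sketch below is only an illustration: it assumes each spectral component is a zero-mean complex Gaussian with independent real and imaginary parts, and the function names, sample count and seed are hypothetical choices. Under that assumption the mean amplitude should be close to $\Gamma(1.5)$ times the standard deviation, the mean squared amplitude close to the variance, and the mean phase close to $\pi$.

```python
import math
import random


def sample_spectral_component(variance, rng):
    """Draw one zero-mean complex Gaussian coefficient with E|X|^2 = variance.

    Assumption: real and imaginary parts are independent, each with
    variance / 2, which is what makes the amplitude Rayleigh distributed.
    """
    std = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, std), rng.gauss(0.0, std))


def amplitude_phase_stats(variance=1.0, n=20000, seed=0):
    """Return (mean amplitude, mean squared amplitude, mean phase)."""
    rng = random.Random(seed)
    sum_amp = 0.0
    sum_sq = 0.0
    sum_phase = 0.0
    for _ in range(n):
        x = sample_spectral_component(variance, rng)
        amp = abs(x)
        # Map the phase from (-pi, pi] onto [0, 2*pi).
        phase = math.atan2(x.imag, x.real) % (2.0 * math.pi)
        sum_amp += amp
        sum_sq += amp * amp
        sum_phase += phase
    return sum_amp / n, sum_sq / n, sum_phase / n


if __name__ == "__main__":
    mean_amp, mean_sq, mean_phase = amplitude_phase_stats()
    print("mean amplitude   :", mean_amp, "(Rayleigh mean:", math.gamma(1.5), ")")
    print("mean squared amp :", mean_sq)
    print("mean phase       :", mean_phase, "(uniform mean:", math.pi, ")")
```

The Rayleigh mean $\Gamma(1.5)\,\sigma_{X_k}$ is also where the $\Gamma(1.5)$ factor in the amplitude estimator comes from.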

Through the derivation given in [10], equation (11.26) can be rewritten as,

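$$|\hat{X}_k| = \Gamma(1.5)\,\frac{\sqrt{v_k}}{\gamma_k}\,\exp\left(-\frac{v_k}{2}\right)\left[(1 + v_k)\,I_0\left(\frac{v_k}{2}\right) + v_k\,I_1\left(\frac{v_k}{2}\right)\right]|Y_k| \qquad (11.31)$$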

where $\Gamma(\cdot)$ is the gamma function with $\Gamma(1.5) = \sqrt{\pi}/2$, $I_0(\cdot)$ and $I_1(\cdot)$ denote the modified Bessel functions of zero and first order, respectively, and

$$v_k \equiv \frac{\xi_k}{1 + \xi_k}\,\gamma_k.$$
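As an illustration of (11.31), the gain applied to $|Y_k|$ can be evaluated for given values of $\xi_k$ and $\gamma_k$. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and the Bessel functions are computed from a truncated power series, which is an assumption adequate only for moderate arguments (roughly $v_k$ below 50). A production implementation would more likely use a library routine with exponential scaling.

```python
import math


def bessel_i(order, x, terms=60):
    """Modified Bessel function of the first kind for integer order >= 0.

    Truncated power series: sum over m of (x/2)^(2m+order) / (m! (m+order)!).
    Assumption: x is moderate (a few tens at most), so 60 terms suffice.
    """
    half = x / 2.0
    total = 0.0
    for m in range(terms):
        total += half ** (2 * m + order) / (
            math.factorial(m) * math.factorial(m + order)
        )
    return total


def mmse_stsa_gain(xi, gamma):
    """Gain multiplying |Y_k| in equation (11.31).

    xi    : the quantity written xi_k in the text (must be > 0)
    gamma : the quantity written gamma_k in the text (must be > 0)
    """
    v = xi / (1.0 + xi) * gamma
    bracket = (1.0 + v) * bessel_i(0, v / 2.0) + v * bessel_i(1, v / 2.0)
    return math.gamma(1.5) * math.sqrt(v) / gamma * math.exp(-v / 2.0) * bracket


if __name__ == "__main__":
    for xi, gamma in [(0.1, 1.0), (1.0, 2.0), (10.0, 11.0)]:
        print(xi, gamma, mmse_stsa_gain(xi, gamma))
```

For large $v_k$ the gain approaches $\xi_k / (1 + \xi_k)$ from above, which gives a quick sanity check on any implementation.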

As a variant, Ephraim and Malah [15] proposed an MMSE log spectral amplitude (MMSE-LSA) estimator, based on the well-known fact that a distortion measure with the log spectral amplitudes is more suitable for speech processing. The MMSE-LSA estimator minimizes the following distortion measure,

$$E\left\{\left(\log|X_k| - \log|\hat{X}_k|\right)^2\right\} \qquad (11.32)$$

with

$$|\hat{X}_k| = \exp\left(E\left\{\log(|X_k|) \mid Y_k\right\}\right) \qquad (11.33)$$

From [15], the final estimate becomes,

$$|\hat{X}_k| = \frac{\xi_k}{1 + \xi_k}\,\exp\left(\frac{1}{2}\int_{v_k}^{\infty}\frac{e^{-t}}{t}\,dt\right)|Y_k| \qquad (11.34)$$
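The integral in (11.34) is the exponential integral $E_1(v_k)$, so the MMSE-LSA gain can be computed once $E_1$ is available. The sketch below is one possible way to do this with the Python standard library only. The function names are hypothetical, and the choice of a power series for small arguments and a continued fraction for larger ones is an assumption of this example, not something prescribed by [15].

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def expint_e1(x, max_iter=200, eps=1e-12):
    """Exponential integral E1(x) = integral from x to infinity of e^-t / t dt, x > 0."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    if x <= 1.0:
        # Power series: -gamma - ln(x) + sum_{k>=1} (-1)^(k+1) x^k / (k * k!)
        total = -EULER_GAMMA - math.log(x)
        term = 1.0  # holds (-1)^(k+1) x^k / k! after each update
        for k in range(1, max_iter + 1):
            term *= x / k if k == 1 else -x / k
            total += term / k
            if abs(term / k) < eps:
                break
        return total
    # Continued fraction evaluated with the modified Lentz method.
    b = x + 1.0
    c = 1e300
    d = 1.0 / b
    h = d
    for i in range(1, max_iter + 1):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * math.exp(-x)


def mmse_lsa_gain(xi, gamma):
    """Gain multiplying |Y_k| in equation (11.34)."""
    v = xi / (1.0 + xi) * gamma
    return xi / (1.0 + xi) * math.exp(0.5 * expint_e1(v))


if __name__ == "__main__":
    for xi, gamma in [(0.1, 1.0), (1.0, 2.0), (10.0, 11.0)]:
        print(xi, gamma, mmse_lsa_gain(xi, gamma))
```

Because $E_1(v_k)$ is positive and decays quickly as $v_k$ grows, this gain is always slightly above $\xi_k / (1 + \xi_k)$ and converges to it for large $v_k$.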

11.2.5 Spectral Estimation Based on the Uncertainty of Speech Presence

The conventional speech enhancement methods can be extended by incorporating the uncertainty of speech presence [14, 15]. The absence and presence of speech, $H_0$ and $H_1$, respectively, can be defined as,

$$H_0 : Y_k = D_k \qquad (11.35)$$

$$H_1 : Y_k = X_k + D_k \qquad (11.36)$$
