Audio Features - Intelligent Audio Analysis


In the following, we will use the z-transformation for the mathematical derivation. The (two-sided) z-transformation of a signal $x(k)$ is given by:

$$X(z) = \sum_{k=-\infty}^{+\infty} x(k)\, z^{-k} \tag{6.35}$$
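As a small illustration of the definition, the following sketch evaluates the sum for a finite sequence at a single point $z$. It assumes that all samples outside the given list are zero; the function name `z_transform` and the parameter `k0` are hypothetical and chosen only for this example.

```python
def z_transform(x, z, k0=0):
    """Evaluate X(z) = sum_k x(k) * z**(-k) for a finite sequence.

    x  : samples x(k0), x(k0 + 1), ...; all other samples are assumed zero
    z  : non-zero (possibly complex) point at which to evaluate
    k0 : time index of the first sample in x (hypothetical convenience parameter)
    """
    return sum(xk * z ** (-(k0 + n)) for n, xk in enumerate(x))


# A unit impulse delayed by two samples has the transform z**(-2).
print(z_transform([0, 0, 1], 2.0))  # 0.25
```

For an infinitely long signal the sum only converges for suitable values of $z$; the finite case shown here avoids that question.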

With the z-transformations $E(z)$ and $S(z)$ of the signals $e(k)$ and $s(k)$, respectively, and obeying the rule of the z-transformation that $s(k-i)$ corresponds to $S(z)\, z^{-i}$ in the z-domain, the following holds:

$$E(z) = S(z) \left( 1 + \sum_{i=1}^{p} a_i z^{-i} \right) \tag{6.36}$$

and for the transfer function $H(z)$:

$$H(z) = \frac{E(z)}{S(z)} = 1 + \sum_{i=1}^{p} a_i z^{-i} \tag{6.37}$$
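In the time domain, this transfer function corresponds to a non-recursive filter that turns the signal into the prediction error, $e(k) = s(k) + \sum_{i=1}^{p} a_i\, s(k-i)$. The sketch below applies it to a short list of samples. It assumes that the signal is zero before $k = 0$; the function name `prediction_error` is hypothetical.

```python
def prediction_error(s, a):
    """Compute e(k) = s(k) + sum_{i=1..p} a_i * s(k - i).

    s : samples s(0), s(1), ...; samples before k = 0 are assumed zero
    a : predictor coefficients [a_1, ..., a_p]
    """
    e = []
    for k in range(len(s)):
        acc = s[k]
        for i, ai in enumerate(a, start=1):
            if k - i >= 0:  # skip samples before the start of the signal
                acc += ai * s[k - i]
        e.append(acc)
    return e


# s(k) = 0.5**k is predicted exactly with a_1 = -0.5, so only the onset remains.
s = [0.5 ** k for k in range(5)]
print(prediction_error(s, [-0.5]))  # [1.0, 0.0, 0.0, 0.0, 0.0]
```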

In the inverse case, the system is excited by the error signal and produces the speech signal; the filter is then purely recursive and its transfer function is the reciprocal, $1/H(z)$. This is a simple model of speech production, in which the vocal tract is seen as a linear filter excited by regular pulses from the vocal cords. The excitation pulses are not linearly predictable with a low number of predictor coefficients within a short analysis interval and thus produce the prediction error. In the case of unvoiced sounds, the excitation is given by white noise. The transfer function of the inverse system has only poles and no zeros, i.e., the system is an all-pole model [6]. These poles can be determined directly from the predictor coefficients $a_i$. One now has to determine these coefficients for a given order $p$ such that the deviation between the estimated signal and the real signal is minimal.
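The inverse (recursive) filter can be sketched in the same way: solving the prediction-error relation for the signal gives $s(k) = e(k) - \sum_{i=1}^{p} a_i\, s(k-i)$. The example assumes a signal that is zero before $k = 0$ and uses a single pulse as excitation; the function name `synthesize` is hypothetical.

```python
def synthesize(e, a):
    """All-pole filter: s(k) = e(k) - sum_{i=1..p} a_i * s(k - i).

    e : excitation (error) samples e(0), e(1), ...
    a : predictor coefficients [a_1, ..., a_p]
    """
    s = []
    for k in range(len(e)):
        acc = e[k]
        for i, ai in enumerate(a, start=1):
            if k - i >= 0:  # output samples before k = 0 are assumed zero
                acc -= ai * s[k - i]
        s.append(acc)
    return s


# A single excitation pulse through a first-order model with a_1 = -0.5.
excitation = [1.0, 0.0, 0.0, 0.0, 0.0]
print(synthesize(excitation, [-0.5]))  # [1.0, 0.5, 0.25, 0.125, 0.0625]
```

With $a_1 = -0.5$ the filter is $1 / (1 - 0.5\, z^{-1})$, which has its single pole at $z = 0.5$; the decaying output reflects that pole.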

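The squared error $\alpha$ within the interval of analysis (for the moment running from $k = -\infty$ to $+\infty$; later within the open window region) is:

$$\alpha = \sum_{k=-\infty}^{+\infty} e^{2}(k) \tag{6.38}$$

$$\alpha = \sum_{k=-\infty}^{+\infty} \left( \sum_{i=0}^{p} a_i\, s(k-i) \right)^{2} \tag{6.39}$$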
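For a finite signal that is assumed to be zero outside the given samples, the infinite sum reduces to a finite one and can be evaluated directly. The following sketch does this with $a_0 = 1$ prepended to the coefficient list; the function name `squared_error` is hypothetical.

```python
def squared_error(s, a):
    """alpha = sum_k ( sum_{i=0..p} a_i * s(k - i) )**2 with a_0 = 1.

    s : samples s(0), ..., s(N - 1); the signal is assumed zero elsewhere
    a : predictor coefficients [a_1, ..., a_p]
    """
    coeffs = [1.0] + list(a)  # prepend a_0 = 1
    p = len(a)
    alpha = 0.0
    # Outside k = 0 .. N + p - 1 every term s(k - i) is zero.
    for k in range(len(s) + p):
        e_k = sum(c * s[k - i] for i, c in enumerate(coeffs)
                  if 0 <= k - i < len(s))
        alpha += e_k ** 2
    return alpha


s = [1.0, 0.5, 0.25]
print(squared_error(s, []))      # 1.3125 (no prediction: energy of s)
print(squared_error(s, [-0.5]))  # 1.015625
```

The second value is not zero even though $a_1 = -0.5$ predicts the decay exactly, because the onset at $k = 0$ and the abrupt end of the signal cannot be predicted.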

Note that, for simplification, a coefficient $a_0$ that equals one was introduced. To determine the minimum of this error, one takes the partial derivative of the error with respect to each predictor coefficient and sets it equal to zero:
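As a sketch of where this step leads: differentiating the squared error with respect to $a_j$ gives $\partial \alpha / \partial a_j = 2 \sum_k e(k)\, s(k-j) = 0$ for $j = 1, \ldots, p$, which is a set of $p$ linear equations in the coefficients. Under the assumption that the signal is zero outside the given samples, these can be written with the autocorrelation $r(m) = \sum_k s(k)\, s(k-m)$ as $\sum_{i=1}^{p} a_i\, r(|i-j|) = -r(j)$. The code below solves them by plain Gaussian elimination. The function names are hypothetical, and this is only one possible way to solve the system, not necessarily the one used in the text.

```python
def autocorr(s, m):
    """r(m) = sum_k s(k) * s(k - m), with s assumed zero outside the list."""
    return sum(s[k] * s[k - m] for k in range(m, len(s)))


def solve_predictor(s, p):
    """Solve sum_{i=1..p} a_i * r(|i - j|) = -r(j) for j = 1..p.

    Assumes a non-degenerate signal (e.g. not all zeros), so that the
    system has a unique solution. Returns [a_1, ..., a_p].
    """
    r = [autocorr(s, m) for m in range(p + 1)]
    # Augmented matrix: row j holds r(|i - j|) for i = 1..p and the right side -r(j).
    M = [[r[abs(i - j)] for i in range(p)] + [-r[j + 1]] for j in range(p)]
    # Forward elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda row: abs(M[row][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, p):
            f = M[row][col] / M[col][col]
            for c in range(col, p + 1):
                M[row][c] -= f * M[col][c]
    # Back substitution.
    a = [0.0] * p
    for row in range(p - 1, -1, -1):
        acc = M[row][p] - sum(M[row][c] * a[c] for c in range(row + 1, p))
        a[row] = acc / M[row][row]
    return a


# A decaying signal s(k) = 0.5**k should yield a_1 close to -0.5.
s = [0.5 ** k for k in range(20)]
print(solve_predictor(s, 1))
```

Because the matrix of autocorrelation values is symmetric with constant diagonals, more efficient solvers exist; Gaussian elimination is used here only to keep the example short.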
