Multimode Speech Coding - Digital Speech: Coding for Low Bit Rate Communication Systems


[Figure 9.23: four panels plotted against sample index. Panels (a) Stationary voiced speech and (b) Transitory speech show the squared error, the cross-correlation, the original speech, and the synthesized speech. Panels (c) Stationary voiced LPC residual and (d) Transitory LPC residual show the squared error, the cross-correlation, the LPC residual, and the LPC excitation.]

Figure 9.23 Squared error, $E_i$, $E_i^r$, and cross-correlation, $R_i$, $R_i^r$, values

In order to estimate the normalized residual cross-correlation, $R_i^r$, and residual squared error, $E_i^r$, equations (9.52) and (9.53) are repeated with $s(n)$ and $\hat{s}(n)$ replaced by $r(n)$ and $\hat{r}(n)$ respectively. Figure 9.23 depicts $E_i$, $R_i$, original speech $s(n)$, and synthesized speech $\hat{s}(n)$. $E_i$ and $R_i$ are aligned with the corresponding pitch cycles of the speech waveforms, and the speech waveforms are shifted down for clarity. Examples of the residual domain signals, LPC residual $r(n)$, LPC excitation $\hat{r}(n)$, $E_i^r$, and $R_i^r$ are also shown in the figure.
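The per-cycle measures described above can be sketched in a few lines of Python. Equations (9.52) and (9.53) are not reproduced in this section, so the sketch assumes the commonly used forms: a squared error normalized by the energy of the reference signal, and a cross-correlation normalized by the geometric mean of the two signal energies. The function names and the `boundaries` list of pitch-cycle start indices are hypothetical and chosen only for illustration. The same function serves the speech domain (pass $s(n)$ and $\hat{s}(n)$) and the residual domain (pass $r(n)$ and $\hat{r}(n)$).

```python
import math


def cycle_metrics(ref, syn):
    """Return (normalized squared error, normalized cross-correlation)
    for one pitch cycle of a reference and a synthesized signal.

    Assumed definitions (not taken from the source equations):
      error = sum((ref - syn)^2) / sum(ref^2)
      corr  = sum(ref * syn) / sqrt(sum(ref^2) * sum(syn^2))
    """
    if len(ref) != len(syn):
        raise ValueError("ref and syn must have the same length")
    ref_energy = sum(x * x for x in ref)
    syn_energy = sum(y * y for y in syn)
    if ref_energy == 0.0 or syn_energy == 0.0:
        raise ValueError("zero-energy cycle: metrics are undefined")
    error = sum((x - y) ** 2 for x, y in zip(ref, syn)) / ref_energy
    corr = sum(x * y for x, y in zip(ref, syn)) / math.sqrt(ref_energy * syn_energy)
    return error, corr


def per_cycle_metrics(ref, syn, boundaries):
    """Evaluate cycle_metrics on each segment [boundaries[k], boundaries[k+1]).

    `boundaries` is a hypothetical list of pitch-cycle start indices,
    with the final entry marking the end of the last cycle.
    """
    return [
        cycle_metrics(ref[start:end], syn[start:end])
        for start, end in zip(boundaries[:-1], boundaries[1:])
    ]


if __name__ == "__main__":
    # Toy signal: three identical 20-sample cycles of a sinusoid.
    ref = [math.sin(2 * math.pi * n / 20) for n in range(60)]
    # A perfectly matched synthesis and a sign-inverted one.
    for label, syn in (("matched", ref), ("inverted", [-x for x in ref])):
        print(label, per_cycle_metrics(ref, syn, [0, 20, 40, 60]))
```

With these assumed definitions, a well-matched cycle gives an error near zero and a correlation near one, while a badly matched cycle gives a larger error and a lower correlation.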

For stationary voiced speech, the squared error, $E_i$, is usually much lower than unity and the normalized cross-correlation, $R_i$, is close to unity. However, the harmonic model fails at the transitions, which results in larger errors and lower correlation values. The estimated normalized cross-correlation and
