Digital Signal Processing Reference
[Figure 9.23: four panels, each plotted against samples. Panels (a) Stationary voiced speech and (b) Transitory speech show the squared error, the cross-correlation, the original speech, and the synthesized speech. Panels (c) Stationary voiced LPC residual and (d) Transitory LPC residual show the squared error, the cross-correlation, the LPC residual, and the LPC excitation.]
Figure 9.23 Squared error, E_i, E_i^r, and cross-correlation, R_i, R_i^r, values
In order to estimate the normalized residual cross-correlation, R_i^r, and
residual squared error, E_i^r, equations (9.52) and (9.53) are repeated with s(n)
and ŝ(n) replaced by r(n) and r̂(n) respectively. Figure 9.23 depicts E_i, R_i,
original speech s(n), and synthesized speech ŝ(n). E_i and R_i are aligned with
the corresponding pitch cycles of the speech waveforms, and the speech
waveforms are shifted down for clarity. Examples of the residual domain
signals, LPC residual r(n), LPC excitation r̂(n), E_i^r, and R_i^r, are also
shown in the figure.
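The per-cycle measures described above can be sketched in Python as follows. This is only an illustration under stated assumptions: equations (9.52) and (9.53) are not reproduced in this section, so the sketch assumes the common forms, a squared error normalized by the energy of the reference cycle and a cross-correlation normalized by the geometric mean of the two cycle energies. The function name `cycle_metrics` and the `pitch_marks` argument (sample indices delimiting consecutive pitch cycles) are hypothetical and do not come from the text.

```python
import numpy as np

def cycle_metrics(ref, syn, pitch_marks):
    """Per-pitch-cycle normalized squared error and cross-correlation.

    Assumed definitions (the book's (9.52) and (9.53) may differ in detail):
        E_i = sum((ref - syn)^2) / sum(ref^2)
        R_i = sum(ref * syn) / sqrt(sum(ref^2) * sum(syn^2))
    where each sum runs over the samples of pitch cycle i.
    """
    ref = np.asarray(ref, dtype=float)
    syn = np.asarray(syn, dtype=float)
    errors, corrs = [], []
    # Consecutive pitch marks delimit one cycle: [start, end)
    for start, end in zip(pitch_marks[:-1], pitch_marks[1:]):
        x, y = ref[start:end], syn[start:end]
        ex, ey = np.dot(x, x), np.dot(y, y)
        if ex == 0.0 or ey == 0.0:
            # Measures are undefined for a silent cycle
            errors.append(np.nan)
            corrs.append(np.nan)
            continue
        errors.append(np.sum((x - y) ** 2) / ex)
        corrs.append(np.dot(x, y) / np.sqrt(ex * ey))
    return np.array(errors), np.array(corrs)
```

Under these assumptions, calling the function with s(n) and ŝ(n) gives E_i and R_i, and calling it with r(n) and r̂(n) gives the residual-domain values E_i^r and R_i^r.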
For stationary voiced speech, the squared error, E_i, is usually much lower
than unity and the normalized cross-correlation, R_i, is close to unity.
However, the harmonic model fails at the transitions, which results in larger errors
and lower correlation values. The estimated normalized cross-correlation and