Digital Signal Processing Reference
some instances of difficult pitch pulse detection along with the estimated
probabilities, p, and the threshold value. In Figures 9.6c and 9.6d, the
resonating speech waveforms are also shown.
The problem illustrated in Figure 9.6b can be explained in both the time
and frequency domains. In speech segments with a short pitch period, the
short-term LPC prediction tends to remove some of the pitch correlation
as well, leaving an LPC residual without any clearly distinguishable peaks.
Shorter pitch periods in the time domain correspond to fewer harmonics in
the frequency domain. Hence the inter-harmonic spacing becomes wider and
the formants of the short-term predictor tend to coincide with some of the
harmonics (see Figure 9.7). The speech spectrum in Figure 9.7 is lowered by
80 dB in order to emphasize the coinciding points of the spectra. The excessive
removal of some of the harmonic components by the LPC filter disperses the
energy of the residual pitch pulses. It has been reported that large errors in
the linear prediction coefficients occur in the analysis of sounds with high
pitch frequencies [35]. In the case of nasal sounds, the speech waveform
has very strong low-frequency content (see Figure 9.6c). In such cases, the
LPC filter simply places a pole at the fundamental frequency. A pole in the
LPC synthesis filter translates to a zero in the inverse filter, giving rise to a
fairly random-looking LPC residual signal. The figures demonstrate that the
estimated probabilities, p, exceed the threshold value only at the required
pitch pulse locations, despite these difficulties.
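
The pole-zero relationship described above can be checked numerically. The following is a minimal sketch, not the analysis used in the text: it assumes an 8 kHz sampling rate (suggested by the 0 to 4000 Hz axis of Figure 9.7), a hypothetical fundamental of 200 Hz, and a hypothetical pole radius of 0.98. It builds a second-order inverse filter A(z) whose zeros sit at the fundamental and compares its gain there with its gain at 2000 Hz. The function names are illustrative only.

```python
import numpy as np

def inverse_filter_from_pole(f0_hz, fs_hz, radius):
    """Coefficients [1, a1, a2] of A(z) = 1 + a1 z^-1 + a2 z^-2.

    The zeros of A(z) are the conjugate pair radius * exp(+/- j*theta),
    i.e. the poles of the synthesis filter 1/A(z).
    """
    theta = 2.0 * np.pi * f0_hz / fs_hz
    return np.array([1.0, -2.0 * radius * np.cos(theta), radius ** 2])

def inverse_gain_db(a, f_hz, fs_hz):
    """Magnitude of A(e^{jw}) in dB at a single frequency f_hz."""
    w = 2.0 * np.pi * f_hz / fs_hz
    phasors = np.exp(-1j * w * np.arange(len(a)))
    return 20.0 * np.log10(np.abs(np.dot(a, phasors)))

fs = 8000.0   # assumed sampling rate
f0 = 200.0    # hypothetical fundamental frequency
a = inverse_filter_from_pole(f0, fs, radius=0.98)  # hypothetical pole radius

gain_at_f0 = inverse_gain_db(a, f0, fs)      # deep notch at the fundamental
gain_at_2k = inverse_gain_db(a, 2000.0, fs)  # far from the zero, no attenuation
print(f"inverse filter gain at F0:      {gain_at_f0:.1f} dB")
print(f"inverse filter gain at 2000 Hz: {gain_at_2k:.1f} dB")
```

Under these assumptions the inverse filter attenuates the fundamental by tens of dB while leaving higher frequencies essentially untouched, which is consistent with a residual that has lost most of its periodic, low-frequency structure.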
[Plot: LPC spectrum and speech spectrum versus frequency (Hz), 0 to 4000 Hz]
Figure 9.7 Speech and LPC spectra of a female vowel segment
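
The link between pitch period, harmonic count, and inter-harmonic spacing is simple arithmetic. The sketch below assumes an 8 kHz sampling rate (consistent with the 4000 Hz upper limit of the frequency axis) and two hypothetical pitch periods, 80 and 32 samples. The function name `harmonic_grid` is illustrative and does not come from the text.

```python
def harmonic_grid(pitch_period_samples, fs_hz=8000):
    """Return (harmonic spacing in Hz, number of harmonics up to fs/2).

    The spacing equals the fundamental F0 = fs / period. Harmonics k*F0
    satisfy k*F0 <= fs/2 exactly when k <= period/2, so the count is
    period // 2.
    """
    spacing_hz = fs_hz / pitch_period_samples
    count = pitch_period_samples // 2
    return spacing_hz, count

# Hypothetical long and short pitch periods, in samples.
for period in (80, 32):
    spacing, count = harmonic_grid(period)
    print(f"period {period} samples: spacing {spacing:.0f} Hz, {count} harmonics")
```

With these values, the 80-sample period gives 40 harmonics spaced 100 Hz apart, while the 32-sample period gives only 16 harmonics spaced 250 Hz apart, so a short-term predictor with a handful of pole pairs has far fewer spectral peaks to distinguish from its formants.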