Harmonic Speech Coding - Digital Speech: Coding for Low Bit Rate Communication Systems - page 258

can be approximated by using the integrals of the component frequencies. Moreover, LPC models the large variation in the speech magnitude spectrum and simplifies the harmonic amplitude quantization.

8.2 Sinusoidal Analysis and Synthesis

Figure 8.1 depicts block diagrams of the sinusoidal analysis and synthesis processes introduced by McAulay. The speech spectrum is estimated by windowing the input speech signal with a Hamming window and then computing the Discrete Fourier Transform (DFT). The frequencies, amplitudes, and phases corresponding to the peaks of the magnitude spectrum become the model parameters of the sinusoidal representation. Employing a pitch-adaptive analysis window whose length is two and a half times the average pitch period improves the accuracy of peak estimation. The synthesizer generates the sine waves corresponding to the estimated frequencies and phases, and modulates them using the amplitudes. All the sinusoids are then summed to produce the synthesized speech. The block edge effects are smoothed out by applying overlap and add with a triangular window. Overlap and add is effectively a simple interpolation technique and, in sinusoidal synthesis, it requires the parameters to be updated at least every 10-15 ms for good-quality speech synthesis. At lower frame rates the spectral peaks need to be properly aligned between the analysis frames to form frequency tracks. The amplitudes of the frequency tracks are linearly interpolated, and the instantaneous phases are generated using a cubic polynomial [1] as shown in Figure 8.2.
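As a rough illustration of the track interpolation described above, the sketch below synthesizes one frequency track across a frame of n samples. It assumes the cubic phase polynomial has the form usually attributed to McAulay and Quatieri, theta(t) = theta0 + w0 t + alpha t^2 + beta t^3, with an integer M selecting the smoothest phase path; whether Figure 8.2 uses exactly this form is an assumption. The function names and the use of radians per sample for frequency are illustrative and do not come from the text.

```python
import numpy as np

def cubic_phase_coeffs(w0, th0, w1, th1, n):
    """Coefficients of theta(t) = th0 + w0*t + alpha*t**2 + beta*t**3.

    w0, w1   : track frequency at frame start and end (radians per sample)
    th0, th1 : measured phase at frame start and end (radians, wrapped)
    n        : frame length in samples
    The cubic matches phase and frequency at both frame boundaries.
    """
    # Integer number of 2*pi wraps that gives the smoothest phase path.
    m = np.round(((th0 + w0 * n - th1) + (w1 - w0) * n / 2.0) / (2.0 * np.pi))
    # Gap between the unwrapped end phase and a constant-frequency prediction.
    delta = th1 + 2.0 * np.pi * m - th0 - w0 * n
    alpha = 3.0 * delta / n**2 - (w1 - w0) / n
    beta = -2.0 * delta / n**3 + (w1 - w0) / n**2
    return alpha, beta, m

def interpolate_track(a0, w0, th0, a1, w1, th1, n):
    """Synthesize n samples of one track between two frame boundaries."""
    t = np.arange(n, dtype=float)
    amp = a0 + (a1 - a0) * t / n          # linear amplitude interpolation
    alpha, beta, _ = cubic_phase_coeffs(w0, th0, w1, th1, n)
    phase = th0 + w0 * t + alpha * t**2 + beta * t**3
    return amp * np.cos(phase)
```

Under these assumptions, a track whose frequency does not change between frames yields alpha and beta of (numerically) zero, so the cubic reduces to a linear phase advance; summing `interpolate_track` over all matched tracks would give the synthesized frame.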

Figure 8.1 General sinusoidal analysis and synthesis
[Block diagram. Sinusoidal speech analysis: input speech, window, DFT, magnitude spectrum, peak picking; outputs are the frequencies, the amplitudes, and the phases of the spectral peaks. Sinusoidal speech synthesis: frequencies, phases, and amplitudes, sine wave generator, sum all sine waves, overlap and add, synthetic speech.]
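The analysis and synthesis chain of Figure 8.1 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the coder described in the text: the function names, the FFT size, the `max_peaks` limit, and the choice to measure phases relative to the frame centre are all illustrative. Pitch estimation is not shown, so the pitch period passed to `pitch_adaptive_length` is assumed to be known.

```python
import numpy as np

def pitch_adaptive_length(pitch_period):
    """Window length of about 2.5 pitch periods (samples), forced odd."""
    return int(round(2.5 * pitch_period)) | 1

def analyze_frame(frame, fs, n_fft=1024, max_peaks=40):
    """Return (freqs_hz, amps, phases) at the magnitude-spectrum peaks.

    Phases are referred to the frame centre. Requires n_fft >= len(frame).
    """
    n = len(frame)
    window = np.hamming(n)
    window /= window.sum()            # a sinusoid of amplitude A peaks near A/2
    xw = frame * window
    half = n // 2
    buf = np.zeros(n_fft)             # rotate so the frame centre is at index 0
    buf[:n - half] = xw[half:]
    buf[n_fft - half:] = xw[:half]
    spec = np.fft.rfft(buf)
    mag = np.abs(spec)
    # Peak picking: local maxima of the magnitude spectrum.
    k = np.flatnonzero((mag[1:-1] > mag[:-2]) & (mag[1:-1] >= mag[2:])) + 1
    k = np.sort(k[np.argsort(mag[k])[::-1][:max_peaks]])  # keep the largest
    return k * fs / n_fft, 2.0 * mag[k], np.angle(spec[k])

def synthesize_frame(freqs_hz, amps, phases, fs, n):
    """Sum of sine waves over n samples, with time zero at the frame centre."""
    t = np.arange(n) - n // 2
    arg = 2.0 * np.pi * np.outer(freqs_hz, t) / fs + phases[:, None]
    return (amps[:, None] * np.cos(arg)).sum(axis=0)

def analysis_synthesis(x, fs, hop, win_len, n_fft=1024, max_peaks=40):
    """Frame-by-frame analysis followed by triangular overlap-add synthesis."""
    # Triangular window of length 2*hop; copies spaced hop apart sum to one.
    tri = 1.0 - np.abs(np.arange(2 * hop) - hop) / hop
    y = np.zeros(len(x))
    half = win_len // 2
    start = max(half, hop)
    stop = len(x) - max(win_len - half, hop)
    for centre in range(start, stop, hop):
        frame = x[centre - half:centre - half + win_len]
        f, a, p = analyze_frame(frame, fs, n_fft, max_peaks)
        y[centre - hop:centre + hop] += tri * synthesize_frame(f, a, p, fs, 2 * hop)
    return y
```

With an 8 kHz sampling rate, `hop=80` corresponds to a 10 ms parameter update, which is the upper end of the rate the text gives for overlap and add. In this sketch every frame is synthesized independently from its own peaks, so no frequency tracks are formed between frames.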
