Digital Signal Processing Reference
FIGURE 10.51. Diagram of the speech synthesis process.
Figure 10.51 is a diagram of the LPC speech synthesis. To reproduce the voice signal,
the following are required:
1. An excitation signal
2. The LPC filter coefficients
The excitation mechanism can be approximated using a residual signal generator
(for voiced signals) or a white Gaussian noise generator (for unvoiced signals) with
adjustable amplitudes and periods. The linear predictor P, a transversal filter with
p delays of one sample interval each, forms a weighted sum of the past samples
at its input. The output at the nth sampling instant, the predictor's weighted sum
plus the excitation, is given by
$$ s_n = \sum_{k=1}^{p} a_k s_m + d_n $$
where $m = n - k$ and $d_n$ represents the nth excitation sample.
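The recursion above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in the project: the function name `lpc_synthesize`, the list-based interface, and the zero initial conditions (past samples before the first one are taken as 0) are all assumptions made for the example.

```python
def lpc_synthesize(coeffs, excitation):
    """Compute s[n] = sum_{k=1..p} a_k * s[n-k] + d[n].

    coeffs     -- [a_1, ..., a_p], the LPC filter coefficients
    excitation -- [d_0, d_1, ...], the excitation samples
    Samples before n = 0 are assumed to be zero (an assumption of this sketch).
    """
    p = len(coeffs)
    output = []
    for n, d_n in enumerate(excitation):
        acc = d_n
        for k in range(1, p + 1):
            m = n - k                      # index of the past output sample
            if m >= 0:
                acc += coeffs[k - 1] * output[m]
        output.append(acc)
    return output


if __name__ == "__main__":
    # A single impulse through a first-order predictor with a_1 = 0.5
    # decays geometrically.
    print(lpc_synthesize([0.5], [1.0, 0.0, 0.0]))
```

With a single coefficient of 0.5 and a unit impulse as excitation, each output sample is half the previous one, which is a quick check that the feedback path uses past outputs and not past inputs.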
Implementation
The input to the program is an array of speech samples taken at an 8-kHz sampling
rate. The samples are stored in a header file. The length of the input speech
array is 10,000 samples, translating into approximately 1.25 seconds of speech. The
input array is segmented into a large number of frames, each 80 B long with an
overlap of 40 B for each frame. Each frame is then passed to the following modules:
windowing, autocorrelation, LPC, residual, IIR, and accumulate. External memory
is utilized. A block diagram of the LPC speech synthesis algorithm with the various
modules is shown in Figure 10.52.
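As a rough illustration of the windowing, autocorrelation, and LPC modules, the following Python sketch processes one frame. The text does not say which window or which solver the project uses, so the Hamming window and the Levinson-Durbin recursion below are assumptions (both are common choices), and all function names are hypothetical.

```python
import math


def hamming(num_points):
    """Hamming window (assumed here; the source does not name the window)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (num_points - 1))
            for i in range(num_points)]


def autocorrelation(frame, max_lag):
    """Return r[0..max_lag], where r[lag] = sum_i frame[i] * frame[i + lag]."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for a_1..a_p so that s[n] ~ sum a_k s[n-k].

    Returns (coefficients, final prediction error energy).
    """
    a = [0.0] * (order + 1)        # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        if err == 0.0:             # silent frame: nothing to predict
            break
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err              # reflection coefficient
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= (1.0 - k * k)
    return a[1:], err


if __name__ == "__main__":
    frame = [math.sin(0.3 * i) for i in range(80)]
    windowed = [w * x for w, x in zip(hamming(len(frame)), frame)]
    coeffs, err = levinson_durbin(autocorrelation(windowed, 10), 10)
    print(coeffs, err)
```

The coefficients returned here use the same sign convention as the synthesis equation, so they can be fed directly into a filter that adds the weighted past outputs to the excitation.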
1. Segmentation. This module separates the input voice into overlapping segments.
The length of each segment is chosen so that the speech within it appears
stationary as well as quasi-periodic. The overlap provides a smooth transition
between consecutive speech frames.
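A minimal Python sketch of this segmentation step follows. It assumes the frame length of 80 and overlap of 40 quoted earlier are counts of array elements, and that a trailing partial frame is dropped; the function name `segment` and both of those choices are assumptions for illustration only.

```python
def segment(samples, frame_len=80, overlap=40):
    """Split samples into overlapping frames.

    Consecutive frames start (frame_len - overlap) elements apart.
    A trailing partial frame is dropped (an assumption of this sketch).
    """
    hop = frame_len - overlap
    if hop <= 0:
        raise ValueError("overlap must be smaller than frame_len")
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames


if __name__ == "__main__":
    speech = list(range(10000))    # stand-in for the 10,000-sample input array
    frames = segment(speech)
    print(len(frames))             # number of frames produced
    # The second half of one frame is the first half of the next.
    print(frames[0][40:] == frames[1][:40])
```

Under these assumptions, the 10,000-element input yields 249 frames, and each frame shares its last 40 elements with the first 40 of the frame that follows.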