Digital Signal Processing Reference
In-Depth Information
model parameters of different modes may be quantized at different bit-rates,
allocating the minimum number of bits required for each mode to maintain
adequate quality.
In the example here, the LPC parameters are common to all the modes and
are quantized using a fixed number of bits per frame. This is advantageous
under noisy channel conditions, since the LPC parameters can be decoded
correctly even when the mode bits are in error. The LPC parameters are
quantized in the LSF domain using a multi-stage vector quantizer (MSVQ)
with first-order moving average (MA) prediction [37]. Having quantized
the LSFs, the excitations of the three modes are quantized differently.
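As an illustration of the LSF quantization idea, the following is a minimal sketch of a multi-stage VQ operating on a first-order MA prediction residual. It is not the quantizer of [37]: the function names, the prediction coefficient alpha, the mean vector, the tiny two-stage codebooks, and the greedy stage-by-stage search are all assumptions made for the example (practical coders use trained codebooks and often a joint or M-best search).

```python
def nearest(codebook, target):
    """Index of the codeword with the smallest squared error to target."""
    def sq_err(codeword):
        return sum((c - t) ** 2 for c, t in zip(codeword, target))
    return min(range(len(codebook)), key=lambda i: sq_err(codebook[i]))


def msvq_encode(lsf, mean, prev_resid, stages, alpha):
    """Greedy multi-stage VQ of the MA-prediction residual (hypothetical helper).

    prev_resid is the quantized residual of the previous frame; alpha is an
    assumed first-order MA prediction coefficient.
    """
    # Remove the mean and the part predicted from the previous quantized residual.
    target = [x - m - alpha * r for x, m, r in zip(lsf, mean, prev_resid)]
    indices = []
    quantized = [0.0] * len(lsf)
    for codebook in stages:
        # Each stage quantizes what the earlier stages left over.
        remaining = [t - q for t, q in zip(target, quantized)]
        idx = nearest(codebook, remaining)
        indices.append(idx)
        quantized = [q + c for q, c in zip(quantized, codebook[idx])]
    return indices, quantized


def msvq_decode(indices, mean, prev_resid, stages, alpha):
    """Rebuild the LSF vector; also return the residual needed for the next frame."""
    resid = [0.0] * len(mean)
    for idx, codebook in zip(indices, stages):
        resid = [r + c for r, c in zip(resid, codebook[idx])]
    lsf = [m + alpha * p + r for m, p, r in zip(mean, prev_resid, resid)]
    return lsf, resid


# Toy two-dimensional example with made-up codebooks.
mean = [0.3, 0.6]
stages = [
    [[-0.1, -0.1], [0.1, 0.1]],      # stage 1: coarse
    [[-0.02, 0.02], [0.02, -0.02]],  # stage 2: refinement
]
prev = [0.0, 0.0]
indices, q_resid = msvq_encode([0.41, 0.68], mean, prev, stages, alpha=0.5)
decoded, next_prev = msvq_decode(indices, mean, prev, stages, alpha=0.5)
```

Because the predictor uses only the previous frame's quantized residual, a channel error affects a bounded number of frames, which is the usual reason for preferring MA over AR prediction in this setting.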
9.9.2 Unvoiced Excitation Quantization
The hybrid coding algorithm synthesizes unvoiced speech using scaled white
Gaussian noise as the LPC excitation. Therefore, only a gain term is required
in addition to the LPC parameters to synthesize unvoiced speech. In order
to synthesize the unvoiced plosives with adequate quality, the gain term
should be updated at least every 5 ms. However, listening tests show that
synthesizing plosives using ACELP gives better perceptual quality. Therefore,
the plosives are synthesized using ACELP. The energy of the fricatives does
not show rapid fluctuations, and updating at the frame rate of every 20 ms is
adequate to synthesize high-quality unvoiced fricatives.
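The unvoiced synthesis described above can be sketched as gain-scaled white Gaussian noise passed through the all-pole LPC synthesis filter 1/A(z). This is only an illustration: the function name, the sign convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p, the 8 kHz sampling rate (160 samples per 20 ms frame), and the example coefficient are assumptions, not values from the text.

```python
import random


def synthesize_unvoiced(lpc, gain, n_samples, state=None, seed=0):
    """Filter gain-scaled white Gaussian noise through 1/A(z).

    lpc holds a_1..a_p for the assumed convention A(z) = 1 + sum_j a_j z^-j.
    state is the filter memory (most recent output first), so that
    consecutive frames can be synthesized without a discontinuity.
    """
    rng = random.Random(seed)
    order = len(lpc)
    history = list(state) if state is not None else [0.0] * order
    out = []
    for _ in range(n_samples):
        excitation = gain * rng.gauss(0.0, 1.0)
        # All-pole recursion: s[n] = e[n] - sum_j a_j * s[n - j]
        sample = excitation - sum(a * h for a, h in zip(lpc, history))
        out.append(sample)
        history = ([sample] + history)[:order]
    return out, history


# One 20 ms frame at an assumed 8 kHz sampling rate, single gain per frame.
frame, memory = synthesize_unvoiced([-0.9], gain=1.0, n_samples=160)
```

Only the gain and the LPC coefficients need to be transmitted; the decoder can draw its own noise, since the ear is insensitive to the exact noise waveform in unvoiced fricatives.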
The unvoiced gain $g_{uv}$ is quantized using a logarithmic scalar quantizer.
The quantized unvoiced gain $g_{uv_i}$ is given by

$$ g_{uv_i} = k \left( \frac{g_{max} + k}{k} \right)^{\frac{i}{N-1}} - k \quad \text{for } i = 0, 1, 2, \ldots, N-1 \qquad (9.56) $$

where $N$ is the number of quantizer levels, $g_{max}$ defines the upper limit of $g_{uv_i}$,
and $k$ is a constant which controls the gradient of the exponential function.
All the $g_{uv}$ values larger than $g_{max}$ are clipped at $g_{max}$. The constant $k$ is set
to 16, and 32 quantizer levels were found sufficient to produce high-quality
unvoiced speech. Hence, five bits are required to transmit the quantized
unvoiced gain $g_{uv_i}$. Figure 9.27 depicts a typical plot of the unvoiced gain
quantizer levels for $g_{max} = 904$.
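A small sketch of such a gain quantizer follows, using k = 16, N = 32 and g_max = 904 from the text. It assumes that Eq. (9.56) has the form g_i = k((g_max + k)/k)^(i/(N-1)) - k, which is a reading of the equation rather than something stated in words, and it assumes a nearest-level decision in the log(g + k) domain, which the text does not specify. The function names are hypothetical.

```python
import math


def gain_levels(g_max=904.0, k=16.0, n_levels=32):
    """Reconstruction levels under the assumed form of Eq. (9.56)."""
    ratio = (g_max + k) / k
    return [k * ratio ** (i / (n_levels - 1)) - k for i in range(n_levels)]


def quantize_gain(g_uv, levels, k=16.0):
    """Return (index, quantized gain).

    Gains above g_max are clipped to g_max, as in the text. The nearest
    level is chosen in the log(g + k) domain, where the assumed levels are
    uniformly spaced (an assumption about the decision rule).
    """
    g = min(max(g_uv, 0.0), levels[-1])
    target = math.log(g + k)
    index = min(
        range(len(levels)),
        key=lambda i: abs(math.log(levels[i] + k) - target),
    )
    return index, levels[index]


levels = gain_levels()
bits = math.ceil(math.log2(len(levels)))   # 32 levels -> 5 bits
index, g_hat = quantize_gain(2000.0, levels)  # clipped to the top level
```

Under this form the first level is 0 and the last is g_max, with step sizes that grow with the gain, so small gains are resolved finely and large gains coarsely.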
9.9.3 Harmonic Excitation Quantization
The stationary voiced speech segments are synthesized using the synchronized
harmonic excitation model described earlier. The model parameters
of the harmonic excitation with SWPM are the pitch period, pitch pulse
location (PPL), pitch pulse shape (PPS), harmonic amplitudes, and gain. The
AbS transition detection algorithm synthesizes the harmonic excitation using
SWPM at the encoder to evaluate the suitability of the harmonic mode.