Digital Signal Processing Reference
squared error values are logically combined to increase the reliability of the
AbS transition detection. The combinations and thresholds are determined
empirically by plotting the parameters with the corresponding speech waveforms.
This heuristic approach is superior to a statistical approach, because it
allows inclusion of the most important transitions, while the less important
ones can be given a lower priority. AbS transition detection compares the
harmonically synthesized speech with the original speech, verifies the accuracy
of the harmonic model parameters, and decides to use ACELP when the
harmonic model fails.
The cross-correlation and squared error values are estimated on the pitch
cycle basis in order to determine the suitability of the harmonic excitation for
each pitch cycle. Estimating the parameters over the complete synthesis frame
may average out a large error caused by a sudden transition. In Figure 9.23a,
the speech waveform has a minor transition. The estimated parameters
also indicate the presence of such a transition. These minor transitions are
synthesized using the harmonic excitation, and the mode is not changed
to waveform coding. Changing the mode for such small variations leads to
excessive switching, which may degrade the speech quality when the bit rate
of the waveform coder is relatively low, because of the quantization noise of
the waveform coding. Moreover, the harmonic excitation is capable of producing
good quality speech despite those small variations in the waveform. To limit
excessive switching further, in addition to maintaining the harmonic mode
across those minor transitions, the harmonic mode is not selected after
ACELP when the speech energy is decreasing rapidly. Rapidly decreasing
speech energy indicates an offset, and at some offsets the coding mode may
fluctuate between ACELP and harmonic if extra restrictions are not imposed.
At such offsets, the error accumulated in the LPC memories during the
harmonic mode is corrected by switching to the ACELP mode, which in turn
causes a switch back to the harmonic mode. The additional measures taken
to eliminate those fluctuations are described below.
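The per-pitch-cycle estimation described at the start of this paragraph can be
illustrated with a minimal sketch. The exact definitions of the cross-correlation
and squared error are not given in this passage, so the normalized forms below
are assumptions, as are the function names and the explicit list of pitch-cycle
boundaries. No decision thresholds are included, since the text determines them
empirically.

```python
import math


def cycle_metrics(original, synthesized):
    """Return (normalized cross-correlation, normalized squared error).

    Both normalizations are assumed forms, not taken from the text.
    """
    energy_o = sum(x * x for x in original)
    energy_s = sum(y * y for y in synthesized)
    cross = sum(x * y for x, y in zip(original, synthesized))
    sq_err = sum((x - y) ** 2 for x, y in zip(original, synthesized))
    denom = math.sqrt(energy_o * energy_s)
    xcorr = cross / denom if denom > 0.0 else 0.0
    norm_err = sq_err / energy_o if energy_o > 0.0 else 0.0
    return xcorr, norm_err


def per_cycle_metrics(original, synthesized, boundaries):
    """Evaluate the metrics separately for each pitch cycle.

    `boundaries` is a hypothetical list of sample indices marking the
    start of each cycle, plus the end of the last one.
    """
    return [
        cycle_metrics(original[a:b], synthesized[a:b])
        for a, b in zip(boundaries[:-1], boundaries[1:])
    ]


# Toy frame: four identical 40-sample cycles. The synthesized signal
# matches the original except in the third cycle, where it is inverted
# to stand in for a sudden transition the harmonic model missed.
cycle = [math.sin(2.0 * math.pi * n / 40.0) for n in range(40)]
original = cycle * 4
synthesized = cycle * 2 + [-x for x in cycle] + cycle
boundaries = [0, 40, 80, 120, 160]

per_cycle = per_cycle_metrics(original, synthesized, boundaries)
whole_frame = cycle_metrics(original, synthesized)
```

In this toy frame the normalized squared error of the third cycle is about 4,
while the value computed over the whole frame is about 1. This is the averaging
effect that motivates the per-cycle estimate.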
In order to avoid mode fluctuations at the offsets, extra restrictions are
imposed when switching to the harmonic mode after waveform coding. The
rms energy of the speech and the LPC residual are computed for each frame,
and a hysteresis loop is added using a control flag. The flag is set to zero
when the speech or the LPC residual rms energy is less than 0.75 times the
corresponding rms energy values of the previous frame. The flag is set to one
when the speech or the LPC residual rms energy is more than 1.25 times the
corresponding rms energy values of the previous frame. The flag is set to zero if
the pitch is greater than 100 samples, regardless of the energy. When switching
to harmonic mode after waveform coding, the control flag must be one, in
addition to the mode decision of the closed-loop transition detection. The flag is
checked only at a mode transition; once the harmonic mode is initialized, the
flag is ignored. This process avoids excessive switching at the offsets.
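The control-flag logic above can be sketched as follows. The 0.75 and 1.25
ratios and the 100-sample pitch limit come from the text; the function names,
the mode labels, and the order in which the rules are applied are assumptions.
In particular, the text does not say what happens when one energy falls below
0.75 while the other rises above 1.25, so this sketch simply applies the rules
in the order they are stated.

```python
import math


def rms(samples):
    """Root-mean-square energy of one frame (0.0 for an empty frame)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def update_flag(flag, speech_rms, residual_rms,
                prev_speech_rms, prev_residual_rms, pitch):
    """Update the hysteresis flag for the current frame.

    Rule order is an assumption; later rules override earlier ones.
    """
    # Energy dropping: clear the flag.
    if (speech_rms < 0.75 * prev_speech_rms
            or residual_rms < 0.75 * prev_residual_rms):
        flag = 0
    # Energy rising: set the flag.
    if (speech_rms > 1.25 * prev_speech_rms
            or residual_rms > 1.25 * prev_residual_rms):
        flag = 1
    # Long pitch period: clear the flag regardless of the energy.
    if pitch > 100:
        flag = 0
    # Between the two ratios the flag keeps its previous value.
    return flag


def allow_harmonic(prev_mode, detector_says_harmonic, flag):
    """Decide whether the harmonic mode may be used for this frame.

    `prev_mode` is a hypothetical label, "harmonic" or "acelp".
    """
    if not detector_says_harmonic:
        return False
    if prev_mode == "harmonic":
        # The flag is only checked at a mode transition.
        return True
    return flag == 1
```

With these definitions, a frame whose energies stay between 0.75 and 1.25
times the previous values leaves the flag unchanged, which is what gives the
scheme its hysteresis.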