to find the best phase match between the concealed packet and the received packet and the
excitation signal of the received packet is modified using an overlap-add procedure.
18.5.2 G.729
The ITU-T G.729 standard was not developed specifically for the Internet. It provides relatively
good quality at 8 kbits per second [245]; however, its relative complexity has made it a less
attractive option for standard telephone applications. In the Internet world this complexity is
not as severe a drawback, as the encoding and decoding are performed on relatively powerful
computers, while the increase in quality afforded by the complexity is a definite plus. A block
diagram of the encoder is shown in Figure 18.11. As in the previous techniques, the encoder
needs to determine the synthesis filter coefficients and the excitation vector, which are then
used by the decoder to reconstruct the speech.
The various operations shown in the block diagram are performed either once per frame,
or once per subframe. A frame corresponds to 10 milliseconds of speech, which at a sampling
rate of 8000 samples per second corresponds to 80 samples. A subframe is a half frame made
up of 40 samples corresponding to 5 milliseconds of speech.
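The frame arithmetic above can be illustrated with a short Python sketch that cuts a sampled signal into 10 millisecond frames of two 5 millisecond subframes each. The name `split_frames` and the choice to drop a trailing partial frame are assumptions made for this illustration; they are not taken from the standard.

```python
SAMPLE_RATE = 8000   # samples per second
FRAME_MS = 10        # frame duration in milliseconds
SUBFRAME_MS = 5      # subframe duration in milliseconds

FRAME_LEN = SAMPLE_RATE * FRAME_MS // 1000        # 80 samples
SUBFRAME_LEN = SAMPLE_RATE * SUBFRAME_MS // 1000  # 40 samples


def split_frames(samples):
    """Split samples into frames, each returned as a list of two subframes.

    A trailing partial frame is dropped (an assumption of this sketch).
    """
    frames = []
    for start in range(0, len(samples) - FRAME_LEN + 1, FRAME_LEN):
        frame = samples[start:start + FRAME_LEN]
        subframes = [frame[i:i + SUBFRAME_LEN]
                     for i in range(0, FRAME_LEN, SUBFRAME_LEN)]
        frames.append(subframes)
    return frames
```

Operations described as "once per frame" would then run in the outer loop over `frames`, and operations described as "once per subframe" in an inner loop over each frame's two subframes.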
The preprocessing step involves downscaling the input by two to reduce the possibility of
overflow and high-pass filtering to remove low-frequency interference including DC bias and
interference from the power source. The filter has two poles and two zeros with a cutoff of
140 Hz.
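A minimal sketch of such a preprocessing stage is given below. It assumes a generic second-order Butterworth-style high-pass section designed with the usual bilinear-transform formulas; the coefficients it produces are not the ones specified in the standard, and the names `highpass_coefficients` and `preprocess` are hypothetical. Only the structure comes from the description above: scale by one half, then apply a filter with two poles and two zeros and a 140 Hz cutoff.

```python
import math

SAMPLE_RATE = 8000.0   # Hz
CUTOFF_HZ = 140.0      # cutoff stated in the text


def highpass_coefficients(cutoff=CUTOFF_HZ, fs=SAMPLE_RATE, q=1 / math.sqrt(2)):
    """Return normalized (b, a) for a generic second-order high-pass section.

    Assumption: Butterworth-style design (q = 1/sqrt(2)), not the
    coefficients given in the G.729 specification.
    """
    w0 = 2.0 * math.pi * cutoff / fs
    alpha = math.sin(w0) / (2.0 * q)
    cos_w0 = math.cos(w0)
    a0 = 1.0 + alpha
    b = [(1.0 + cos_w0) / 2.0 / a0,
         -(1.0 + cos_w0) / a0,
         (1.0 + cos_w0) / 2.0 / a0]
    a = [1.0, -2.0 * cos_w0 / a0, (1.0 - alpha) / a0]
    return b, a


def preprocess(samples):
    """Halve the input, then high-pass filter it (direct form I)."""
    b, a = highpass_coefficients()
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for s in samples:
        x0 = 0.5 * s  # downscale by two to reduce the risk of overflow
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out
```

Feeding a constant (DC) input through `preprocess` should give an output that decays toward zero, while a tone well above the cutoff should pass with roughly half its original amplitude because of the downscaling.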
The computation of the synthesis filter is relatively straightforward; the identification of
the excitation signal is somewhat more involved. The filtered signal is analyzed to obtain the
short-term analysis and synthesis filters. Once per frame the signal is windowed using a 30
millisecond window given by
\[
w_{lp}(k) =
\begin{cases}
0.54 - 0.46 \cos\left(\dfrac{2\pi k}{399}\right), & k = 0, 1, \ldots, 199 \\[2ex]
\cos\left(\dfrac{2\pi (k - 200)}{159}\right), & k = 200, 201, \ldots, 239
\end{cases}
\]
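The window can be tabulated directly. The sketch below assumes the piecewise definition reads as half of a Hamming window for k = 0, ..., 199 followed by a quarter cycle of a cosine for k = 200, ..., 239, giving 240 coefficients (30 milliseconds at 8000 samples per second). The function names are hypothetical.

```python
import math

WINDOW_LEN = 240   # 30 ms at 8000 samples per second
RISE_LEN = 200     # number of samples covered by the first branch


def lp_window():
    """Return the 240 window coefficients w_lp(0), ..., w_lp(239)."""
    w = []
    for k in range(WINDOW_LEN):
        if k < RISE_LEN:
            # First branch: rising half of a Hamming window.
            w.append(0.54 - 0.46 * math.cos(2.0 * math.pi * k / 399.0))
        else:
            # Second branch: a quarter cycle of a cosine, falling from 1.
            w.append(math.cos(2.0 * math.pi * (k - RISE_LEN) / 159.0))
    return w


def apply_window(samples):
    """Multiply a 240-sample analysis block by the window, sample by sample."""
    if len(samples) != WINDOW_LEN:
        raise ValueError("expected exactly 240 samples")
    return [s * wk for s, wk in zip(samples, lp_window())]
```

The window is asymmetric: it rises slowly over 200 samples and falls over only 40, so the most recent samples of the analysis block receive the greatest weight.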
The autocorrelation coefficients are obtained from the windowed signal and scaled to reduce
arithmetic problems using the same scaling as the iLBC encoder (Equation (35)). The
Levinson-Durbin algorithm is then used to obtain the linear predictive coefficients, which are
then used to obtain the line spectral frequencies. A fourth-order moving average predictor is
used to predict the LSF coefficients in the current frame. The prediction residual is quantized
using a two-stage vector quantizer. The first stage is a 10-dimensional VQ with a codebook of
size 128. The quantization error from the first stage is then quantized using the second stage,
which consists of two five-dimensional vector quantizers, one for the first five coefficients and
one for the second five coefficients. Each five-dimensional vector quantizer has a codebook size of 32.
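The two-stage structure just described can be sketched as follows. This is an illustration under stated assumptions, not the standard's procedure: the nearest-neighbor search uses a plain squared error, the codebooks are supplied by the caller (the actual tables are not reproduced here), and all function names are hypothetical. With codebooks of size 128, 32, and 32, the three indices described above take 7 + 5 + 5 = 17 bits.

```python
ORDER = 10          # number of LSF coefficients
STAGE1_SIZE = 128   # first-stage codebook size
STAGE2_SIZE = 32    # size of each second-stage codebook


def nearest(codebook, target):
    """Return the index of the codevector closest to target (squared error)."""
    def dist(i):
        return sum((c - t) ** 2 for c, t in zip(codebook[i], target))
    return min(range(len(codebook)), key=dist)


def quantize(residual, stage1, stage2_low, stage2_high):
    """Two-stage quantization; returns the three indices and the reconstruction."""
    # Stage 1: quantize the full 10-dimensional prediction residual.
    i1 = nearest(stage1, residual)
    error = [r - c for r, c in zip(residual, stage1[i1])]
    # Stage 2: quantize the two halves of the first-stage error separately.
    i2 = nearest(stage2_low, error[:5])
    i3 = nearest(stage2_high, error[5:])
    refinement = stage2_low[i2] + stage2_high[i3]  # list concatenation
    recon = [c + e for c, e in zip(stage1[i1], refinement)]
    return (i1, i2, i3), recon


def index_bits():
    """Bits needed for the three codebook indices described in the text."""
    sizes = (STAGE1_SIZE, STAGE2_SIZE, STAGE2_SIZE)
    return sum((n - 1).bit_length() for n in sizes)
```

Splitting the second stage into two five-dimensional quantizers keeps each search to 32 candidates, where a single 10-dimensional second stage with the same number of bits would need 1024.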
Once the LSF coefficients are quantized they are adjusted in order to make sure that the order
of the frequencies is preserved and also to prevent two frequencies from being too close to
each other. The latter situation would lead to unwanted resonances in the synthesis filter. In
order to preserve continuity between frames the computed and quantized LSF values are used
for the second subframe. For the first subframe the standard requires using an interpolated set
of LSF values. The interpolation is performed on the LSP values q i . The LSP values of the