FIGURE 18.10  Steps in the encoding of speech for the Internet Low Bandwidth
Coder. [Block diagram of the encoder stages, from high-pass filtering and LPC
analysis through start-state identification and quantization, encoding of the
remaining residuals, and packetization.]
This effectively gives us a 20-bit fixed codebook. This corresponds to a codebook size of 2^20.
However, we do not need to store the codebook. By assigning more or fewer pulses per track
we can dramatically change the “size” of the codebook and get different coding rates. The
standard details a rapid search procedure to obtain the excitation vectors.
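As a rough illustration of how the pulse allocation sets the codebook "size", the sketch below counts the bits needed to index a codebook of signed pulses. The layout used in the example (4 tracks, 16 candidate positions per track, one signed pulse per track) is an assumption chosen so that the total comes out to 20 bits; it is not stated in this passage. The function name is hypothetical, and the sketch assumes each pulse is indexed independently, which an actual standard may improve on.

```python
import math

def algebraic_codebook_bits(tracks, positions_per_track, pulses_per_track):
    """Bits needed to index a codebook of signed pulses (hypothetical helper).

    Assumes each pulse is coded independently with a position index
    plus one sign bit; positions_per_track must be a power of two.
    """
    position_bits = int(math.log2(positions_per_track))
    if 2 ** position_bits != positions_per_track:
        raise ValueError("positions_per_track must be a power of two")
    bits_per_pulse = position_bits + 1  # position index + sign bit
    return tracks * pulses_per_track * bits_per_pulse

# Assumed layout: 4 tracks, 16 positions each, one signed pulse per track.
bits = algebraic_codebook_bits(tracks=4, positions_per_track=16, pulses_per_track=1)
codebook_size = 2 ** bits  # number of distinct excitation vectors

# Doubling the pulses per track doubles the index length under this scheme.
more_bits = algebraic_codebook_bits(4, 16, 2)
```

Because the excitation vector is fully determined by the pulse positions and signs, the index alone suffices and no table of 2^20 vectors has to be stored.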
The voice activity detector allows the encoder to significantly reduce the rate during
pauses in speech. During these periods the background noise is coded at a low rate by
transmitting parameters describing the noise. This comfort noise is synthesized at the decoder.
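The idea can be sketched with a simple energy threshold. This is only an illustration and not the detector defined in the standard, which uses more elaborate features; the function names, the threshold, and the use of a single noise-level parameter are all assumptions.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)

def classify_frames(frames, threshold):
    """Label each frame 'speech' or 'noise' by an energy threshold.

    For noise frames only a single noise-level parameter is kept,
    standing in for the low-rate description sent to the decoder.
    """
    decisions = []
    for frame in frames:
        energy = frame_energy(frame)
        if energy > threshold:
            decisions.append(("speech", frame))   # full-rate coding would go here
        else:
            decisions.append(("noise", energy))   # send only the noise level
    return decisions

# Two toy frames: one loud (speech-like), one quiet (background noise).
frames = [[1000, -1200, 900, -800], [3, -2, 4, -3]]
labels = [kind for kind, _ in classify_frames(frames, threshold=100.0)]
```

A decoder receiving a noise-level parameter would generate random noise of matching energy rather than play silence.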
18.5 Coding of Speech for Internet Applications
The Internet presents a very different set of problems and opportunities for speech compression.
Data transmitted over the Internet has to be packetized. This can mean that the information
needed to reconstruct the speech in a packet has to come from within that packet. This in turn
results in much higher bit rates than we have considered in previous sections. Consecutive
packets can make their way along different routes from the encoder to the decoder. If a
packet is delayed long enough, it has to be considered lost, and the
decoder has to come up with strategies to ameliorate the effects of this loss. On the positive
side, the processing power available for the encoding and decoding can be significantly higher
if the codec is implemented on computers rather than telephones (though the distinction may
sometimes be academic).
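The rule that a sufficiently delayed packet counts as lost can be sketched as follows. The helper name, the 20 ms packet spacing, and the 100 ms delay limit are assumptions made for illustration; an actual receiver would typically adapt its limit to network conditions.

```python
def classify_packets(send_times, arrival_times, max_delay):
    """Mark each packet 'ok' or 'lost' (hypothetical helper).

    arrival_times[i] is None if packet i never arrived. A packet whose
    network delay exceeds max_delay is treated as lost, since the decoder
    cannot wait for it without interrupting playback.
    """
    status = []
    for sent, arrived in zip(send_times, arrival_times):
        if arrived is None or arrived - sent > max_delay:
            status.append("lost")
        else:
            status.append("ok")
    return status

# Times in milliseconds; packets are assumed to be sent every 20 ms.
send = [0, 20, 40, 60]
arrive = [35, 190, None, 98]
status = classify_packets(send, arrive, max_delay=100)
```

Here the second packet does arrive, but 170 ms after it was sent, so the decoder treats it the same way as the third packet, which never arrives.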
In this section we will look at three different speech coding standards developed specifically
for the Internet environment: the Internet Low Bitrate Codec (iLBC), the ITU-T G.729 standard,
and SILK, the coder used by Skype.
18.5.1 iLBC
The iLBC was first proposed by Global IP Sound (later Global IP Solutions, which was bought
by Google in 2011), and is used in a number of Voice over Internet Protocol (VoIP) applications
including Google Talk, Skype, and Yahoo Messenger. The coder was standardized by the
Internet Engineering Task Force in RFC 3951 [243] and RFC 3952 [244]. The iLBC allows
coding of speech sampled at 8000 samples per second at two fixed rates, 15.2 kbits per second and
13.33 kbits per second, where the input is quantized using a 16-bit uniform quantizer. A block
diagram of the encoding process is shown in Figure 18.10.
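To put the two rates in perspective, the short calculation below compares them with the rate of the uncompressed input. The numbers come from the paragraph above; the variable and function names are only illustrative.

```python
SAMPLE_RATE_HZ = 8000            # samples per second (from the text)
BITS_PER_SAMPLE = 16             # uniform quantizer resolution (from the text)
ILBC_RATES_BPS = (15200, 13330)  # the two fixed iLBC rates, in bits per second

raw_rate_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE  # uncompressed input rate

def compression_ratio(coded_rate_bps):
    """Ratio of the uncompressed input rate to the coded rate."""
    return raw_rate_bps / coded_rate_bps

ratios = [round(compression_ratio(r), 2) for r in ILBC_RATES_BPS]
```

The input arrives at 128 kbits per second, so the two modes correspond to compression ratios of roughly 8.4 and 9.6.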
After an optional high-pass filter with a cutoff frequency of 90 Hz to remove low-frequency
noise, such as a 60-cycle hum or a DC bias, the data are divided into blocks that are encoded
independently of each other. This independence helps ameliorate the effects of packet loss.
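The preprocessing and blocking steps can be sketched as below. The filter here is a generic first-order RC high-pass filter used as a stand-in; it is not the filter specified in the standard. The function names are hypothetical, and the block length of 160 samples (20 ms at 8000 samples per second) is used only as an example.

```python
import math

def high_pass(samples, cutoff_hz=90.0, sample_rate_hz=8000.0):
    """First-order RC high-pass filter (an illustrative stand-in)."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate_hz
    alpha = rc / (rc + dt)
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        # y[n] = alpha * (y[n-1] + x[n] - x[n-1])
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out

def split_into_blocks(samples, block_len):
    """Cut the signal into consecutive, non-overlapping blocks."""
    return [samples[i:i + block_len] for i in range(0, len(samples), block_len)]

# One second of a constant (DC) offset: the filter output decays toward zero.
dc = [1000.0] * 8000
filtered = high_pass(dc)
blocks = split_into_blocks(filtered, block_len=160)
```

Each block would then be handed to the encoder on its own, so that the loss of one packet does not affect the decoding of the others.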
For the higher rate option the data are blocked into 20 millisecond blocks corresponding to
 