spaced. This is by design. To achieve a more accurate numeric value (and thus, a more
accurate reconstructed signal at the other end), the frequencies more common to voice are
tightly packed with numeric values, whereas the “fringe frequencies” on the high and low
end of the spectrum are more spaced apart.
The sampling device breaks the 8 binary bits in each byte into two components: a
positive/negative indicator and the numeric representation. As shown in Figure 1-13, the
first bit indicates positive or negative, and the remaining seven bits represent the actual
numeric value.
1 0 1 1 0 1 0 0
Figure 1-13 Encoding Voice into Binary Values
Because the first bit in Figure 1-13 is a 1, you read the number as positive. The remaining
seven bits represent the number 52. This is the digital value used for one voice sample.
Now, remember, the Nyquist theorem dictates that you need to take 8,000 of those samples
every single second. Doing the math, 8,000 samples per second times the 8 bits in each
sample gives you 64,000 bits per second. It's no coincidence that uncompressed audio
(including the G.711 audio codec) consumes 64 kbps. Once the sampling device assigns
numeric values to all these analog signals, a router can place them into a packet and
send them across a network.
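
The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the sign-plus-seven-bits reading described in the text; the function name decode_sample and the constant names are hypothetical and do not come from any codec library.

```python
SAMPLES_PER_SECOND = 8_000  # sampling rate stated in the text
BITS_PER_SAMPLE = 8         # one byte per voice sample


def decode_sample(bits: str) -> int:
    """Read an 8-bit string as a sign bit followed by a 7-bit magnitude.

    Follows the convention in the text: a leading 1 means positive.
    """
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 8 binary digits")
    sign = 1 if bits[0] == "1" else -1
    magnitude = int(bits[1:], 2)  # remaining seven bits as an unsigned number
    return sign * magnitude


sample = decode_sample("10110100")  # the bits shown in Figure 1-13
bit_rate = SAMPLES_PER_SECOND * BITS_PER_SAMPLE

print(sample)    # 52
print(bit_rate)  # 64000
```

The second result is the 64 kbps figure quoted for uncompressed G.711 audio.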
Note: There are two forms of the G.711 codec: μ-law (used primarily in the United States
and Japan) and A-law (used everywhere else). The quantization method described in the
preceding paragraph represents G.711 A-law. G.711 μ-law codes in exactly the opposite
way. If you were to take all the 1 bits in Figure 1-13 and make them 0s and take all the 0 bits
and make them 1s, you would have the G.711 μ-law equivalent. Yes, it doesn't make sense
to code it that way, but who said things we do in the United States should make sense?
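
The bit flip described in the note can be sketched as follows. This mirrors only the note's simplified description; the actual G.711 encodings differ in more than bit polarity, and the function name invert_bits is hypothetical.

```python
def invert_bits(byte: int) -> int:
    """Flip every bit of an 8-bit value (each 1 becomes 0 and each 0 becomes 1)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value that fits in 8 bits")
    return byte ^ 0xFF  # XOR with 11111111 inverts all eight bits


figure_bits = 0b10110100            # the sample from Figure 1-13
flipped = invert_bits(figure_bits)

print(f"{flipped:08b}")  # 01001011
```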
The last and optional step in the digitization process is to apply compression measures.
Advanced codecs, such as G.729, allow you to compress the number of samples sent and
thus use less bandwidth. This is possible because sampling human voice 8,000 times a
second produces many samples that are similar or identical. For example, say the word “cow”
out loud to yourself (provided you are in a relatively private area). That takes about a
second to say, right? If not, say it slower until it does. Now, listen to the sounds you are
making. There's the distinct “k” sound that starts the word, then you have the “ahhhhhh”
sound in the middle, followed by the “wa” sound at the end. If you were to break that into
8,000 individual samples, chances are most of them would sound the same.
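
To make the redundancy concrete, here is a toy sketch that collapses runs of identical samples into (value, count) pairs. It is not how G.729 works internally; it only illustrates why repeated samples leave room for compression. The sample values are made up for the example and the function name run_lengths is hypothetical.

```python
from itertools import groupby


def run_lengths(samples):
    """Collapse consecutive identical samples into (value, repeat_count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(samples)]


# Made-up values standing in for three sustained sounds.
samples = [3] * 5 + [52] * 10 + [7] * 3

print(len(samples))          # 18
print(run_lengths(samples))  # [(3, 5), (52, 10), (7, 3)]
```

Eighteen samples reduce to three pairs here, which is the kind of saving that repeated samples make possible.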
The process G.729 (and most other compressed codecs) uses to compress this audio is to
send a sound sample once and simply tell the remote device to continue playing that
 
 