Audio Quality and Compression
Wow, lots of theory. Why do we care? If you paid attention, you can now tell whether an audio
file is of high quality or not depending on the sampling rate and the data type used to store each
sample. The higher the sampling rate and the higher the data type precision, the better the quality
of the audio. However, that also means that we need more storage room for our audio signal.
Imagine that we record the same 60-second sound twice: once at a sampling rate of 8 kHz
with 8 bits per sample, and once at a sampling rate of 44 kHz with 16 bits per sample.
How much memory would we need to store each recording? In the first case,
we need 1 byte per sample. Multiply this by the sampling rate of 8,000 Hz, and we need 8,000
bytes per second. For our full 60 seconds of audio, that's 480,000 bytes, or roughly
half a megabyte (MB). Our higher-quality recording needs quite a bit more memory: 2 bytes per
sample times 44,000 samples per second is 88,000 bytes per second. Multiply this
by 60 seconds, and we arrive at 5,280,000 bytes, or a little over 5 MB. Your usual 3-minute pop
song would take up over 15 MB at that quality, and that's only a mono recording. For a stereo
recording, you'd need twice that amount of memory. Quite a lot of bytes for a silly song!
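The arithmetic above can be checked with a few lines of Python. This is only a sketch; the function name pcm_size_bytes and its parameters are made up for illustration and are not part of any library.

```python
def pcm_size_bytes(sample_rate_hz, bits_per_sample, seconds, channels=1):
    """Return the storage needed for uncompressed audio samples, in bytes."""
    bytes_per_sample = bits_per_sample // 8
    return sample_rate_hz * bytes_per_sample * channels * seconds


# The two 60-second mono recordings from the text.
low_quality = pcm_size_bytes(8_000, 8, 60)
high_quality = pcm_size_bytes(44_000, 16, 60)

# A 3-minute song at the higher quality, mono and stereo.
song_mono = pcm_size_bytes(44_000, 16, 180)
song_stereo = pcm_size_bytes(44_000, 16, 180, channels=2)

print(low_quality, high_quality, song_mono, song_stereo)
```

The stereo figure is simply the mono figure doubled, since each channel stores its own stream of samples.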
Many smart people have come up with ways to reduce the number of bytes needed for an
audio recording. They've invented rather complex psychoacoustic compression algorithms
that analyze an uncompressed audio recording and output a smaller, compressed version. The
compression is usually lossy, meaning that some minor parts of the original audio are omitted.
When you play back MP3s or OGGs, you are actually listening to lossy compressed audio. So,
using formats such as MP3 or OGG will help us reduce the amount of space needed to store our
audio on disk.
What about playing back the audio from compressed files? While dedicated decoding hardware
exists for various compressed audio formats, common audio hardware can often only cope with
uncompressed samples. Before actually feeding the audio card with samples, we have to first
read them in and decompress them. We can do this once and store all of the uncompressed
audio samples in memory, or stream in only small chunks of the audio file as needed.
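The two strategies can be sketched in Python as follows. Note the assumptions: the standard library has no MP3 or OGG decoder, so an uncompressed WAV file built in memory stands in for the audio file here, and the helper names (make_test_wav, load_all, stream_chunks) are invented for this example. With a compressed format, the decompression step would happen wherever readframes is called.

```python
import io
import wave


def make_test_wav(seconds=1, sample_rate=8000):
    """Build a small mono 8-bit WAV file in memory (pure silence)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(1)  # 1 byte per sample
        w.setframerate(sample_rate)
        # 128 is the silence level for unsigned 8-bit samples.
        w.writeframes(bytes([128]) * (sample_rate * seconds))
    buf.seek(0)
    return buf


def load_all(fileobj):
    """Strategy 1: read every sample into memory at once."""
    with wave.open(fileobj, "rb") as w:
        return w.readframes(w.getnframes())


def stream_chunks(fileobj, frames_per_chunk=1024):
    """Strategy 2: yield the samples one small chunk at a time."""
    with wave.open(fileobj, "rb") as w:
        while True:
            chunk = w.readframes(frames_per_chunk)
            if not chunk:
                break
            yield chunk


effect = load_all(make_test_wav())
chunks = list(stream_chunks(make_test_wav(), frames_per_chunk=1024))
```

With streaming, only one chunk needs to be held in memory at any moment, at the cost of touching the file repeatedly during playback.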
In Practice
You have seen that even 3-minute songs can take up a lot of memory. When we play back our
game's music, we will therefore stream the audio samples in on the fly instead of preloading all
audio samples to memory. Usually, we only have a single music stream playing, so only one
stream needs to read from the disk at any given time.
For short sound effects, such as explosions or gunshots, the situation is a little different. We
often want to play a sound effect multiple times simultaneously. Streaming the audio samples
from disk for each instance of the sound effect is not a good idea. We are lucky, though, as short
sounds do not take up a lot of memory. We will therefore read all samples of a sound effect into
memory, from where we can directly and simultaneously play them back.
We have the following requirements:
• We need a way to load audio files for streaming playback and for playback
  from memory.
• We need a way to control the playback of streamed audio.
• We need a way to control the playback of fully loaded audio.
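One way to picture these three requirements is as three small interfaces. The following is a hypothetical sketch only: the class names (Audio, Music, Sound) and every method name are assumptions chosen for illustration, not an API defined in the text so far.

```python
from abc import ABC, abstractmethod


class Music(ABC):
    """Streamed audio: samples are read from disk while playing."""

    @abstractmethod
    def play(self): ...

    @abstractmethod
    def pause(self): ...

    @abstractmethod
    def stop(self): ...

    @abstractmethod
    def dispose(self): ...


class Sound(ABC):
    """Fully loaded audio: all samples are held in memory."""

    @abstractmethod
    def play(self, volume): ...

    @abstractmethod
    def dispose(self): ...


class Audio(ABC):
    """Loads audio files for either kind of playback."""

    @abstractmethod
    def new_music(self, filename): ...

    @abstractmethod
    def new_sound(self, filename): ...
```

Separating the loader from the two playback types keeps the streaming and in-memory cases apart, so each can expose only the controls that make sense for it.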
 