Recording and Playback
The principle of recording and playing back audio is actually pretty simple in theory. For recording, we keep track of how much pressure the molecules that form the sound waves exert on an area in space at each point in time. Playing this data back is merely a matter of getting the air molecules surrounding the speaker to swing and move the same way they did when we recorded them.
In practice, it is of course a little more complex. Audio is usually recorded in one of two ways: analog or digital. In both cases, the sound waves are recorded with some sort of microphone, which usually consists of a membrane that translates the pressure exerted by the molecules into some sort of signal. How this signal is processed and stored is what makes the difference between analog and digital recording. We are working digitally, so let's just have a look at that case.
Recording audio digitally means that the state of the microphone membrane is measured and stored at discrete time steps. Depending on the pressure exerted by the surrounding molecules, the membrane can be pushed inward or outward relative to a neutral state. This process is called sampling, as we take membrane state samples at discrete points in time. The number of samples we take per time unit is called the sampling rate. Usually the time unit is one second, so the sampling rate is measured in hertz (Hz), or samples per second. The more samples per second, the higher the quality of the audio. CDs play back at a sampling rate of 44,100 Hz, or 44.1 kHz. Lower sampling rates are found, for example, when transferring voice over the telephone line (8 kHz is common in this case).
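To make the idea of sampling a bit more tangible, here's a minimal sketch in Java that generates one second of a 440 Hz sine tone at CD quality; the tone frequency, the duration, and the class name are just illustrative choices, not anything tied to a specific API.

public class SamplingSketch {
    public static void main(String[] args) {
        int sampleRate = 44100;                    // samples per second (Hz), CD quality
        double frequency = 440.0;                  // tone frequency in Hz, picked for illustration
        float[] samples = new float[sampleRate];   // one second of mono audio
        for (int i = 0; i < samples.length; i++) {
            double time = i / (double) sampleRate; // seconds elapsed at this sample
            // Each array element is one sample: the membrane state at that time step.
            samples[i] = (float) Math.sin(2.0 * Math.PI * frequency * time);
        }
        System.out.println("Generated " + samples.length + " samples");
    }
}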
The sampling rate is only one attribute responsible for a recording's quality. The way in which we store each membrane state sample also plays a role, and it too is subject to digitization. Let's recall what the membrane state actually is: it's the distance of the membrane from its neutral state. Because it makes a difference whether the membrane is pushed inward or outward, we record the signed distance. The membrane state at a specific time step is thus a single negative or positive number. We can store this signed number in a variety of ways: as a signed 8-, 16-, or 32-bit integer, as a 32-bit float, or even as a 64-bit float. Every data type has limited precision. An 8-bit signed integer can distinguish 127 positive and 128 negative distance values. A 32-bit integer provides a lot more resolution. When stored as a float, the membrane state is usually normalized to the range between −1 and 1, where the maximum positive and minimum negative values represent the farthest the membrane can travel from its neutral state. The membrane state is also called the amplitude. It represents the loudness of the sound hitting the membrane.
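Here's a small sketch of how an integer sample format relates to the normalized float representation, assuming 16-bit samples; the scaling constants come from the range of a signed 16-bit integer, and the method and class names are made up for illustration.

public class SampleFormatSketch {
    // Map a signed 16-bit sample to the normalized range [-1, 1].
    // 32768 is the magnitude of the most negative 16-bit value.
    static float toFloat(short pcm) {
        return pcm / 32768f;
    }

    // Map a normalized float back to a signed 16-bit sample.
    static short toPcm16(float normalized) {
        // Clamp first so values outside [-1, 1] don't wrap around and distort.
        float clamped = Math.max(-1f, Math.min(1f, normalized));
        return (short) (clamped * 32767f);
    }

    public static void main(String[] args) {
        System.out.println(toFloat((short) 32767));  // roughly 1.0: membrane at its maximum excursion
        System.out.println(toFloat((short) -16384)); // -0.5: membrane pushed halfway inward
        System.out.println(toPcm16(0.25f));          // 8191: a quarter of the maximum amplitude
    }
}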
With a single microphone, we can only record mono sound, which loses all spatial information. With two microphones, we can measure sound at different locations in space and thus get so-called stereo sound. You might achieve stereo sound, for example, by placing one microphone to the left and another to the right of an object emitting sound. When the sound is played back simultaneously through two speakers, we can reasonably reproduce the spatial component of the audio. But this also means that we need to store twice as many samples for stereo audio as for mono audio.
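To see what that doubling means in practice, here's a quick back-of-the-envelope sketch that computes the storage needed for one minute of audio at CD quality, mono versus stereo; the one-minute duration and the 16-bit sample format are assumptions chosen for the example.

public class StorageSketch {
    public static void main(String[] args) {
        int sampleRate = 44100;       // samples per second per channel
        int bytesPerSample = 2;       // 16-bit samples
        int seconds = 60;             // one minute of audio

        long monoBytes = (long) sampleRate * bytesPerSample * seconds;
        long stereoBytes = monoBytes * 2;  // two channels, so twice the samples

        System.out.println("Mono:   " + monoBytes + " bytes");   // 5,292,000 bytes, roughly 5 MB
        System.out.println("Stereo: " + stereoBytes + " bytes"); // 10,584,000 bytes, roughly 10 MB
    }
}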
Playback is a simple matter in the end. Once we have our audio samples in digital form, with a specific sampling rate and data type, we can throw that data at our audio processing unit, which transforms the information into a signal for an attached speaker. The speaker interprets this signal and translates it into the vibration of a membrane, which in turn causes the surrounding air molecules to move and produce sound waves. It's exactly what is done for recording, only reversed!
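To round things off, here's a hedged sketch of that last step using Java's standard javax.sound.sampled API (chosen purely for illustration, not tied to any particular game framework): it generates a short sine tone and streams the 16-bit samples to the default audio device.

import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;

public class PlaybackSketch {
    public static void main(String[] args) throws LineUnavailableException {
        int sampleRate = 44100;
        // 16-bit, mono, signed, little-endian PCM.
        AudioFormat format = new AudioFormat(sampleRate, 16, 1, true, false);

        // One second of a 440 Hz tone, encoded as two bytes per sample.
        byte[] buffer = new byte[sampleRate * 2];
        for (int i = 0; i < sampleRate; i++) {
            short sample = (short) (Math.sin(2.0 * Math.PI * 440.0 * i / sampleRate) * 32767);
            buffer[2 * i] = (byte) (sample & 0xff);             // low byte
            buffer[2 * i + 1] = (byte) ((sample >> 8) & 0xff);  // high byte
        }

        // Hand the samples to the audio hardware, which drives the speaker membrane.
        SourceDataLine line = AudioSystem.getSourceDataLine(format);
        line.open(format);
        line.start();
        line.write(buffer, 0, buffer.length);
        line.drain();   // wait until everything has been played
        line.close();
    }
}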
 