Digital Signal Processing Reference
standard. In this section, we will focus on the au and wav formats. Typically,
a digital audio file stored in the au format has an .au extension, while digital
audio stored in the wav format has a .wav extension.
MATLAB provides a number of library functions to read and write audio
files stored in the au and wav formats. For the au format, MATLAB provides the
auread and auwrite functions to read and write an audio file, respectively.
Likewise, the wavread and wavwrite functions are available to read and
write an audio file in the wav format. The following code reads the audio file
“testaudio1.wav” using the wavread function. The wavread function returns three
output arguments. The first argument x is an array in which
the audio signal is stored. For mono (single-channel) audio signals, x is a
1D vector. For stereo (dual-channel) signals, x is a 2D array with one column
for each of the two channels (speakers). The second argument Fs
represents the sampling rate, while the third argument nbit represents the
number of bits per sample.
>> % Reading the input audio file
>> infile = 'f:\MATLAB\signal\testaudio1.wav';   % audio file
>> [x, Fs, nbit] = wavread(infile);   % x = signal
                                      % Fs = sampling rate
                                      % nbit = number of bits per sample
The above MATLAB program will produce a 1D array x with dimension
26 079 × 1. In other words, the audio signal is a mono signal and contains 26 079
samples. The sampling rate is 22.05 ksamples/s and the signal is quantized using
an 8-bit quantizer. The waveform of the audio signal stored in the testaudio1.wav
file is shown in Fig. 17.4. To play the audio signal stored in x, we use the sound
or soundsc function available in MATLAB as follows:
>> sound(x,Fs);
The soundsc function normalizes the entries of vector x so that the sound is
played as loud as possible without clipping. The mean value is also removed.
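
The scaling described above can be sketched in a few lines. This is only an illustration of the description in the text (mean removal followed by peak normalization); the built-in soundsc function may differ in detail. The test tone, its DC offset, and the variable names xc and xs are assumptions made for this example.

```matlab
% Sketch of the scaling described in the text; not soundsc's actual code.
Fs = 22050;                        % assumed sampling rate (samples/s)
t  = (0:Fs-1)' / Fs;               % one second of sample instants
x  = 0.2 + 0.1*sin(2*pi*440*t);    % hypothetical quiet tone with a DC offset

xc = x - mean(x);                  % remove the mean value
xs = xc / max(abs(xc));            % scale so that the peak magnitude is 1

% sound(xs, Fs);                   % uncomment to listen to the scaled signal
```

Because xs already spans the full range from -1 to 1, playing it with sound should be about as loud as playback can get without clipping.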
After playing the vector x obtained from testaudio1.wav, you should
recognize that the file contains the spoken word “audio.” Relating the word
“audio” to Fig. 17.4, we observe that the waveform has three distinct segments.
The first segment represents the syllable “au,” the second segment represents
the syllable “di,” and the last segment represents “o.” Some silent intervals,
represented by near-zero-amplitude waveforms, are also observed in the plot.
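
In newer MATLAB releases, wavread and wavwrite have been superseded by audioread and audiowrite, with audioinfo reporting the number of bits per sample. The sketch below shows how the same three quantities (signal, sampling rate, bits per sample) might be obtained with those functions. Since testaudio1.wav is not available here, the example writes a hypothetical one-second test tone to a temporary file first; the tone, the file name, and the variable names are assumptions for illustration only.

```matlab
% Assumes a MATLAB release that provides audiowrite, audioread and audioinfo.
Fs = 22050;                               % sampling rate (samples/s)
t  = (0:Fs-1)' / Fs;                      % one second of sample instants
y  = 0.5*sin(2*pi*440*t);                 % hypothetical mono test tone

tmpfile = [tempname '.wav'];              % hypothetical temporary wav file
audiowrite(tmpfile, y, Fs, 'BitsPerSample', 8);   % 8-bit quantization

[x, Fs2] = audioread(tmpfile);            % x = signal, Fs2 = sampling rate
info = audioinfo(tmpfile);                % file metadata
nbit = info.BitsPerSample;                % number of bits per sample

[nSamples, nChannels] = size(x);          % rows = samples, columns = channels
duration = nSamples / Fs2;                % duration in seconds

fprintf('%d samples, %d channel(s), %g samples/s, %d bits, %.3f s\n', ...
        nSamples, nChannels, Fs2, nbit, duration);

delete(tmpfile);                          % clean up the temporary file
```

Applying the same duration calculation to the signal described above, 26 079 samples at 22.05 ksamples/s correspond to roughly 1.18 s of audio.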
17.2.3 Spectral analysis of speech signals
In Section 17.1, we presented techniques for estimating the spectral content
of a nonstationary signal. Audio signals such as speech, music, and ambient