Digital Signal Processing Reference
In-Depth Information
spectrogram of an input .wav file, shows a representative sample section of a piece
of music featuring a live drum, a voice, and other instruments. The beat pattern is
visible in the spectrogram of the music, and the energy plot shows that the beat of
the drum can be the most energy-rich portion of the music.
Furthermore, it is advantageous to filter out any higher-frequency portions of the
music that may also have high energy. This has the added advantage that the parts
of the music containing no bass line will not “confuse” the algorithm.
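The low-pass step described above can be sketched as follows. The text does not say which filter the project uses, so this is only an illustration: the one-pole structure, the names `lowpass_t`, `lowpass_init`, and `lowpass_step`, and any cutoff frequency passed in are assumptions, not part of beatdetector.c.

```c
#include <assert.h>

/* One-pole low-pass filter: y[k] = y[k-1] + a * (x[k] - y[k-1]).
   Hypothetical helper; the filter used in the actual project is not given. */
typedef struct {
    float a;  /* smoothing coefficient, between 0 and 1 */
    float y;  /* previous output sample */
} lowpass_t;

/* Derive the coefficient from a cutoff frequency and a sampling rate (Hz). */
void lowpass_init(lowpass_t *f, float cutoff_hz, float fs_hz)
{
    assert(f != 0 && cutoff_hz > 0.0f && fs_hz > 0.0f);
    float rc = 1.0f / (2.0f * 3.14159265f * cutoff_hz); /* RC time constant */
    float dt = 1.0f / fs_hz;                            /* sampling period  */
    f->a = dt / (rc + dt);
    f->y = 0.0f;
}

/* Filter one input sample and return the filtered output. */
float lowpass_step(lowpass_t *f, float x)
{
    f->y += f->a * (x - f->y);
    return f->y;
}
```

Each incoming sample would pass through `lowpass_step` before being stored in the buffer, so that high-frequency content contributes little to the energy estimate. A steeper filter may be preferable in practice; the one-pole form is chosen here only for brevity.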
Implementation
Figure 10.6 shows the partial C source program beatdetector.c that can be
completed readily. The project can be tested first using the executable (.out) file on
the CD in the folder beatdetector. The incoming music signal is continuously
sampled at 8 kHz (with a 4-kHz antialiasing filter on the codec) and stored in a
buffer. The buffer has 4000 points and is decomposed into 20 chunks, each chunk
consisting of 200 points. The signal energy of a smaller portion of the buffer—a
“chunk” of the larger buffer—consisting of the most recently collected samples is
compared to the signal energy of the entire buffer. When this portion of the signal
has a significantly higher energy than the rest of the signal, it is considered to be a
beat. The average algorithm is described by the following equations:
$$\langle E \rangle = \frac{1}{N} \sum_{k=0}^{N-1} B[k]^2$$

$$\langle e \rangle = \frac{1}{n} \sum_{k=i_0}^{i_0+n-1} B[k]^2$$

$$\text{beat} = \begin{cases} \text{true}, & \langle e \rangle > C \cdot \langle E \rangle \\ \text{false}, & \text{otherwise} \end{cases}$$

where $\langle E \rangle$ and $\langle e \rangle$ represent the average energy of the entire buffer and of
a chunk, respectively, and the third equation describes the actual beat detection
logic. C is the comparison factor (sensitivity), B is the buffer, and $i_0$ is the start
position of the chunk in the buffer. N and n represent the number of points in the
buffer and in the chunk, respectively.
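The energy comparison can be written directly in C. The sketch below is not the code of Figure 10.6; the function names `mean_energy` and `is_beat`, the use of 16-bit samples, and the double-precision accumulator are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Average of the squared samples B[start] .. B[start + count - 1]. */
double mean_energy(const short *B, size_t start, size_t count)
{
    assert(B != NULL && count > 0);
    double sum = 0.0;
    for (size_t k = start; k < start + count; k++) {
        double s = (double)B[k];
        sum += s * s;
    }
    return sum / (double)count;
}

/* Returns 1 when the chunk starting at i0 (n points) has an average energy
   greater than C times the average energy of the whole N-point buffer. */
int is_beat(const short *B, size_t N, size_t i0, size_t n, double C)
{
    assert(n > 0 && i0 + n <= N);
    double E = mean_energy(B, 0, N);   /* buffer average */
    double e = mean_energy(B, i0, n);  /* chunk average  */
    return e > C * E;
}
```

With the sizes given in the text, a call would look like `is_beat(B, 4000, 3800, 200, 1.3)` when the most recent chunk occupies the last 200 points of the buffer.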
To fine-tune this method, the following can be adjusted: (1) the length N of the
larger buffer (the total signal being compared against), (2) the length n of the chunks
(the “instantaneous” signal), and (3) the sensitivity C of the energy comparison.
Values for C ranging from 0.5 to 2 were tested, and a value of 1.3 seems to be optimal
for most types of music.
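One way to keep the three tunable values together is a small detector structure that is fed one sample at a time. This is a sketch under assumptions: the names `beat_detector_t`, `beat_init`, and `beat_push`, the per-chunk running sums, the `MAX_CHUNKS` limit, and the warm-up rule (no beat is reported until the buffer has filled once) are illustrative choices, not taken from beatdetector.c.

```c
#include <assert.h>
#include <string.h>

#define MAX_CHUNKS 64  /* assumed upper limit on N / n */

typedef struct {
    int    chunk_len;             /* n: samples per chunk               */
    int    num_chunks;            /* N / n: chunks held in the buffer   */
    double sensitivity;           /* C: comparison factor               */
    double chunk_sum[MAX_CHUNKS]; /* sum of squares of each stored chunk */
    double total_sum;             /* sum of squares over the buffer     */
    double current_sum;           /* sum of squares of chunk in progress */
    int    filled;                /* samples collected in current chunk */
    int    slot;                  /* ring index of the oldest chunk     */
    int    stored;                /* chunks stored so far (warm-up)     */
} beat_detector_t;

void beat_init(beat_detector_t *d, int buffer_len, int chunk_len, double C)
{
    assert(d != 0 && chunk_len > 0 && buffer_len % chunk_len == 0);
    assert(buffer_len / chunk_len <= MAX_CHUNKS);
    memset(d, 0, sizeof *d);
    d->chunk_len = chunk_len;
    d->num_chunks = buffer_len / chunk_len;
    d->sensitivity = C;
}

/* Feed one sample. Returns -1 while the current chunk is still filling,
   1 if the chunk just completed is a beat, and 0 otherwise. */
int beat_push(beat_detector_t *d, short sample)
{
    d->current_sum += (double)sample * (double)sample;
    if (++d->filled < d->chunk_len)
        return -1;

    /* The completed chunk replaces the oldest one in the ring. */
    d->total_sum += d->current_sum - d->chunk_sum[d->slot];
    d->chunk_sum[d->slot] = d->current_sum;
    d->slot = (d->slot + 1) % d->num_chunks;
    if (d->stored < d->num_chunks)
        d->stored++;

    double e = d->current_sum / d->chunk_len;
    double E = d->total_sum / ((double)d->chunk_len * d->num_chunks);
    d->current_sum = 0.0;
    d->filled = 0;

    if (d->stored < d->num_chunks)
        return 0;  /* buffer not yet full: average is not meaningful */
    return e > d->sensitivity * E;
}
```

The values from the text correspond to `beat_init(&d, 4000, 200, 1.3)`. Keeping a sum of squares per chunk means the buffer average is updated with one addition and one subtraction per chunk instead of a pass over all N points, and changing N, n, or C requires only a different call to `beat_init`.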
A larger buffer size can give a better energy average; however, this has several
drawbacks:
1. A larger chunk size means lower accuracy since the beat status can only be
updated as often as a single chunk is filled and processed.
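As a worked example of this trade-off, the update interval follows from the chunk length and the sampling rate. The symbols $f_s$, $T_{\text{chunk}}$, and $T_{\text{buffer}}$ are introduced here for illustration; the numbers are the ones given earlier (8 kHz sampling, 200-point chunks, 4000-point buffer).

```latex
T_{\text{chunk}} = \frac{n}{f_s} = \frac{200}{8000\ \text{Hz}} = 25\ \text{ms},
\qquad
T_{\text{buffer}} = \frac{N}{f_s} = \frac{4000}{8000\ \text{Hz}} = 0.5\ \text{s}
```

The beat status is therefore refreshed every 25 ms against an average taken over the last half second. Doubling the chunk length to 400 points would halve the update rate, to once every 50 ms.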