Digital Signal Processing Reference
In-Depth Information
Steinway D from MAPS, and the Pianoforte from the Real World Computing Music
Database [ 56 ]. This resulted in the following general metrics:
F =
63
.
4%,
A =
46
7 %. In the second, the test piano was left out from training and
the templates were learned with the two other pianos. This resulted in the following
general metrics:
.
5%,
E tot =
60
.
1%.
This shows that the best results are obtained when only the test piano is used for
training, meaning that considering other pianos does not add useful information to
the system. When the test piano is not used for training, generalization is not perfect
yet the system with the algorithm BND is still competitive with the other off-line
systems. We also emphasize that in a real-time setup, the templates can in general
be learned from the corresponding piano.
To go further, we also submitted the system to MIREX 2010 where it was evaluated
and compared to other algorithms on different tasks of polyphonic music transcrip-
tion for various instruments and kinds of music. 4 The system we submitted was a
preliminary version of the algorithm BND with just piano templates in the dictionary
as described in [ 23 ], and was the only real-time system in competition. It performed
however comparably to the other systems, with the following general metrics at the
frame level for general music with various instruments:
F =
58
.
4%,
A =
41
.
2%,
E tot =
69
.
F =
.
A =
.
57
4%,
45
7%,
E tot =
.
7 %. Moreover, the system also finished second on seven systems for the
note level tasks of tracking in general music with various instruments and of tracking
in piano music.
84
14.6.2 Drum Transcription
For the problem of drum transcription, we considered two drum loops as sample
examples. The first one contains three instruments: kick, snare and hi-hat, and the
second one contains four instruments of a different drum kit: kick, snare, hi-hat and
tom.
The drum loops were both decomposed onto the same dictionary of four templates
representing a kick, a snare, a hi-hat and a tom. The templates were learned from
isolated samples of the second drum kit. This was done to assess the generalization
capacity of the system and algorithms on the first loop. Moreover, we added an
important background of recorded polyphonic music from a wind quintet to the
second loop in order to assess robustness issues as well. The two corresponding drum
loops are available on the companion website. The representation front-end used for
decomposition of the loops was the same as for polyphonic music transcription,
except that the sampling rate was set to 22,050 Hz to account for high-frequency
discriminative information in the hi-hat.
4 The results of the 2010 MIREX evaluation for multiple fundamental frequency estimation and
tracking are available on-line: http://www.music-ir.org/mirex/wiki/2010:Multiple_Fundamental_
Frequency_Estimation_%26_Tracking_Results .
Search WWH ::




Custom Search