Digital Signal Processing Reference
Table 14.1
Results of the transcription evaluation per algorithm
Algorithm   P      R      F      A      E_subs  E_miss  E_fals  E_tot
END         51.4   63.3   56.7   39.6   16.9    19.8    42.9    79.6
SPND        52.8   61.6   56.8   39.7   16.5    21.9    38.5    77.0
BND         68.1   65.9   67.0   50.3    8.5    25.6    22.4    56.5
[17]        61.0   66.7   63.7   46.8   10.4    22.9    32.3    65.6
[55]        60.0   70.8   65.0   48.1   16.3    12.8    30.8    60.0
The system was evaluated with the following algorithms, with parameters tuned
manually to optimize results over the database: END, SPND with λ_1 = 100, and BND
with β = 0.5. The decompositions were respectively about 10 times, 10 times and
5 times faster than real-time under MATLAB on a 2.40 GHz laptop with 4.00 GB of
RAM. The evaluation of the algorithm SCND is not included since it did not improve
results compared to END or SPND, and it was computationally too expensive to run
in real-time.
The activation coefficients output by the algorithms were all post-processed with
the same transcription threshold, set manually to 0.02. We did not use any further
post-processing, so as to compare the quality of the observations output by
the different algorithms at the frame level. For complementary information, we
discuss the use of further post-processing in [23], where minimum-duration pruning
is employed for smoothing the observations at the note level.
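The thresholding step described above can be illustrated with a minimal Python
sketch. It assumes the activations are stored as one list of coefficients per
frame (one coefficient per note template) and that a note is reported when its
coefficient is strictly greater than the threshold. The function name, the data
layout, the strict comparison and the sample values are illustrative
assumptions, not the system's actual implementation.

```python
from typing import List


def threshold_activations(activations: List[List[float]],
                          threshold: float = 0.02) -> List[List[bool]]:
    """Binarize a (frames x templates) activation matrix, frame by frame.

    The default of 0.02 is the transcription threshold quoted in the text;
    using a strict '>' comparison is an assumption of this sketch.
    """
    return [[a > threshold for a in frame] for frame in activations]


# Hypothetical activations for 3 frames and 4 note templates.
activations = [
    [0.00, 0.35, 0.01, 0.00],
    [0.03, 0.40, 0.00, 0.00],
    [0.10, 0.015, 0.00, 0.05],
]

piano_roll = threshold_activations(activations)

# Indices of the templates detected as active in each frame.
active_notes = [[k for k, on in enumerate(frame) if on] for frame in piano_roll]
print(active_notes)  # [[1], [0, 1], [0, 3]]
```

Because no smoothing is applied, each frame is decided independently, which is
what makes the comparison a frame-level one.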
To compare results, we also performed the evaluation for two state-of-the-art
off-line systems: one based on beta NMF with a harmonic model and spectral
smoothness [17], and another based on a sinusoidal analysis with a candidate
selection exploiting spectral features [55].
We report the evaluation results per algorithm in Table 14.1. Standard evaluation
metrics from the MIREX are used as defined in [53]: precision P, recall R,
F-measure F, accuracy A, total error E_tot, substitution error E_subs, missed
error E_miss, and false alarm error E_fals. All scores are given in percent.
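For readers unfamiliar with these scores, the following Python sketch computes
them from frame-level annotations. It assumes the commonly used frame-level
definitions (counts of correct, inserted and missed pitches accumulated over
frames, with the error scores normalized by the number of reference pitches);
these may differ in detail from the definitions in [53]. The function name, the
data layout (one set of active pitches per frame) and the toy data are
illustrative assumptions.

```python
from typing import Dict, List, Set


def frame_metrics(reference: List[Set[int]],
                  estimate: List[Set[int]]) -> Dict[str, float]:
    """Frame-level multi-pitch scores in percent (assumed definitions)."""
    tp = fp = fn = 0
    n_ref = subs = miss = fals = tot = 0
    for ref, est in zip(reference, estimate):
        correct = len(ref & est)
        tp += correct
        fp += len(est - ref)
        fn += len(ref - est)
        n_ref += len(ref)
        subs += min(len(ref), len(est)) - correct   # wrong pitch reported
        miss += max(0, len(ref) - len(est))         # too few pitches reported
        fals += max(0, len(est) - len(ref))         # too many pitches reported
        tot += max(len(ref), len(est)) - correct

    def pct(num: int, den: int) -> float:
        return 100.0 * num / den if den else 0.0

    return {
        "P": pct(tp, tp + fp),
        "R": pct(tp, tp + fn),
        "F": pct(2 * tp, 2 * tp + fp + fn),  # harmonic mean of P and R
        "A": pct(tp, tp + fp + fn),
        "E_subs": pct(subs, n_ref),
        "E_miss": pct(miss, n_ref),
        "E_fals": pct(fals, n_ref),
        "E_tot": pct(tot, n_ref),
    }


# Toy example with MIDI pitch numbers: four frames of reference and estimate.
reference = [{60, 64}, {60, 64}, {62}, set()]
estimate = [{60, 64}, {60, 65}, {62, 67}, {70}]
scores = frame_metrics(reference, estimate)
for name, value in scores.items():
    print(f"{name}: {value:.1f}")
```

Under these definitions the total error is the sum of the substitution, missed
and false alarm errors, and it can exceed 100 percent when many spurious notes
are reported.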
Overall, the results show that the proposed real-time system and algorithms
perform comparably to the state-of-the-art off-line algorithms of [17] and [55].
The algorithm BND even outperforms the other approaches on most metrics, including
precision, F-measure, accuracy and total error. Sparsity control in SPND improves
the economy in the usage of note templates for reconstructing the music signal,
resulting in general in a lower recall but a higher precision compared to END.
In other words, more notes are missed, but this is compensated by the reduction
of note insertions and substitutions. As a result, sparsity control brings no
noticeable global improvement to the general transcription in terms of F-measure,
accuracy and total error. This is in contrast with the benefits brought by
the flexible control on the energy-dependent frequency compromise in the
decomposition for the algorithm BND.
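As a quick sanity check on the figures of Table 14.1, the short Python sketch
below recomputes two quantities from the reported scores. It assumes that the
F-measure is the harmonic mean of precision and recall and that the total error
is the sum of the three error types, which is the usual convention but is not
stated explicitly here. The tolerances account for the one-decimal rounding of
the table and are arbitrary choices.

```python
# Rows of Table 14.1: (P, R, F, A, E_subs, E_miss, E_fals, E_tot), in percent.
table = {
    "END":  (51.4, 63.3, 56.7, 39.6, 16.9, 19.8, 42.9, 79.6),
    "SPND": (52.8, 61.6, 56.8, 39.7, 16.5, 21.9, 38.5, 77.0),
    "BND":  (68.1, 65.9, 67.0, 50.3, 8.5, 25.6, 22.4, 56.5),
    "[17]": (61.0, 66.7, 63.7, 46.8, 10.4, 22.9, 32.3, 65.6),
    "[55]": (60.0, 70.8, 65.0, 48.1, 16.3, 12.8, 30.8, 60.0),
}

checks = {}
for name, (p, r, f, a, e_subs, e_miss, e_fals, e_tot) in table.items():
    f_check = 2 * p * r / (p + r)        # assumed: harmonic mean of P and R
    e_check = e_subs + e_miss + e_fals   # assumed: E_tot is the sum of errors
    checks[name] = (abs(f_check - f) <= 0.1 and abs(e_check - e_tot) <= 0.15)
    print(f"{name:5s} F={f_check:5.1f} (table {f})  "
          f"E_tot={e_check:5.1f} (table {e_tot})")

# Algorithm with the highest reported F-measure.
best_f = max(table, key=lambda name: table[name][2])
print("Highest F-measure:", best_f)
```

All five rows agree with both relations up to rounding, and the highest
F-measure is obtained by BND.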
To assess the generalization capacity of the system, we focused on the algorithm
BND and performed two other evaluations. In the first, the templates were learned
as above but with three pianos: the Yamaha Disklavier Mark III from MAPS, the
 