Table 11.17  Top-ranked chord uni- and bigrams in the LM by frequency of occurrence

Rank   1-gram        #    2-gram        #
1      G       244 820    D-G      57 500
2      D       227 549    G-G      55 106
3      A       198 958    C-G      54 702
4      C       188 194    G-C      54 040
5      E       130 896    A-D      46 162
6      F        87 741    D-A      43 534
7      B        72 360    G-G      41 090
8      Am       58 929    A-A      40 161
9      Em       57 537    D-D      39 710
10     A#       32 583    E-A      36 659
Table 11.18  WA for the ChoRD corpus, LOSO evaluation

WA [%]                       Correlation     SVM     HMM    HMM + LM
24 major / minor                   39.41   40.24   58.57       60.13
36 major / minor / other           28.37   36.71   45.39       48.84

'Other' chords cover augmented, diminished, power, and sustained chords.
As alternative data-driven processing methods, we compare SVMs to HMMs with
and without the language model. A linear kernel, pairwise multi-class discrimination,
and SMO learning proved to be the best choice for the SVMs. For the HMMs, one continuous
model with one emitting state per beat was used. The models were trained with 20
Baum-Welch iterations [133]. A single Gaussian mixture component was the best
choice. To enable Viterbi search for decoding, a 'word-loop' context-free grammar
modelled the chord sequence in the case where no data-driven language model was
used. When the language model is enabled (HMM + LM), Laplace-smoothed class-based
Katz back-off bigrams with a cutoff of one were found to be the best
configuration.
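The chord statistics in Table 11.17 and the language model configuration above can be illustrated with a simplified Python sketch (hypothetical, not the authors' implementation): it estimates Laplace-smoothed unigram probabilities over chord symbols and backs off from bigrams at or below a count cutoff of one to the unigram estimate. A full class-based Katz back-off model would additionally redistribute discounted probability mass.

from collections import Counter

def train_bigram_lm(chord_sequences, cutoff=1):
    """Simplified back-off bigram LM over chord symbols (illustrative sketch)."""
    unigrams = Counter()
    bigrams = Counter()
    for seq in chord_sequences:
        unigrams.update(seq)
        bigrams.update(zip(seq[:-1], seq[1:]))

    vocab = sorted(unigrams)
    total = sum(unigrams.values())

    def p_unigram(c):
        # Laplace (add-one) smoothing: every chord receives one pseudo-count
        return (unigrams[c] + 1) / (total + len(vocab))

    def p_bigram(prev, c):
        # bigrams observed more often than the cutoff are used directly,
        # all others back off to the smoothed unigram estimate
        if bigrams[(prev, c)] > cutoff:
            return bigrams[(prev, c)] / unigrams[prev]
        return p_unigram(c)

    return p_unigram, p_bigram

# Toy usage with hypothetical beat-wise chord label sequences
p_uni, p_bi = train_bigram_lm([["G", "D", "G", "C"], ["A", "D", "G"]])
print(p_bi("D", "G"))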
11.5.3 Performance
A song-independent cyclic 'leave-one-song-out' (LOSO) training and testing was
chosen as the evaluation strategy under realistic conditions. Table 11.18 shows the observed
WA for the different data-free and data-learnt chord determination strategies.
One notes that the WA increases with increasing inclusion of data on the AM and LM level and with context
modelling. Accordingly, HMMs exceed SVMs, as they allow for
contextual modelling. The mapping to, and thereby reduction to, major and minor
chords leads to higher WA while still handling 'any input', provided this is appropriate in
the context of the application.
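For illustration, a minimal sketch of the LOSO protocol with the SVM baseline follows, assuming beat-wise feature vectors X, chord class labels y, and a song index per beat; the randomly generated arrays are mere placeholders, and the HMM decoding with the language model is not shown.

import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import LeaveOneGroupOut
from sklearn.svm import SVC

# Hypothetical placeholders for beat-wise data
X = np.random.rand(200, 12)              # e.g. 12-dimensional chroma per beat
y = np.random.randint(0, 24, size=200)   # 24 major / minor chord classes
songs = np.repeat(np.arange(10), 20)     # 10 songs with 20 beats each

# Linear kernel with pairwise (one-vs-one) multi-class discrimination
clf = SVC(kernel="linear", decision_function_shape="ovo")

y_pred = np.empty_like(y)
for train_idx, test_idx in LeaveOneGroupOut().split(X, y, groups=songs):
    clf.fit(X[train_idx], y[train_idx])           # train on all but one song
    y_pred[test_idx] = clf.predict(X[test_idx])   # test on the held-out song

print(f"WA: {100.0 * accuracy_score(y, y_pred):.2f} %")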