Applications in Intelligent Music Analysis - Intelligent Audio Analysis


The two main approaches, template correlation-based and data-driven modelling, were

compared and evaluated on novel feature types. The data-driven model prevailed at a

maximum of 77.3 % WA for 12 keys for the whole dataset. For correct recognition of

six out of seven scale semitones, 94.2 % WA was reached. For individual datasets,

the correlation approach partly showed better results, but SVMs were superior given

sufficient data owing to their ability to better cope with diversity: according to [ 123 ],

perceptual studies of tonal hierarchies show genre and task dependency. In the case

of 24 keys, the difference between these two approaches was amplified from 5.0 to

6.7 % absolute difference in WA. The maximum WA was 62.1 % for the correct key

and 84.9 % for six out of seven notes.
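
The template correlation idea compared above can be sketched in a few lines. The following is a minimal illustration only, not the method evaluated in the text: it assumes a 12-bin pitch class (CHROMA-like) vector as input and uses the well-known Krumhansl-Kessler probe-tone profiles as stand-in templates. The function name `estimate_key` and the toy input are hypothetical. Rotating each profile over 12 tonics and two modes yields the 24 key candidates.

```python
import numpy as np

# Krumhansl-Kessler probe-tone profiles (C major / C minor), used here only as
# stand-in templates; the templates evaluated in the text may differ.
MAJOR = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                  2.52, 5.19, 2.39, 3.66, 2.29, 2.88])
MINOR = np.array([6.33, 2.68, 3.52, 5.38, 2.60, 3.53,
                  2.54, 4.75, 3.98, 2.69, 3.34, 3.17])
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def estimate_key(chroma):
    """Return (tonic pitch class, mode, correlation) of the best-matching template."""
    chroma = np.asarray(chroma, dtype=float)
    best = None
    for mode, profile in (("major", MAJOR), ("minor", MINOR)):
        for tonic in range(12):
            # Rotating the profile by `tonic` semitones transposes the template.
            template = np.roll(profile, tonic)
            r = np.corrcoef(chroma, template)[0, 1]
            if best is None or r > best[2]:
                best = (tonic, mode, r)
    return best


# Toy input (hypothetical): a pitch class vector shaped like a G major profile.
tonic, mode, r = estimate_key(np.roll(MAJOR, 7))
print(NOTE_NAMES[tonic], mode, round(r, 3))
```

A data-driven alternative would replace the fixed templates by a classifier trained on labelled pitch class vectors, which is where the amount and diversity of training data come into play.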

As for parametrisation, an optimum has been found for adapting reference pitch

classes to compensate for tape speed variation, using Gaussian filters for semitone

filtering, analysing the whole piece, and using the frequency band from C3 to C8, i.e.,

130.8 to 4 186 Hz, for feature computation. The proposed feature types

based on music theory and human perception were able to improve both approaches

for key assignment.
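
To make the parametrisation more concrete, the sketch below computes semitone centre frequencies for the band C3 to C8 and a Gaussian weighting around each centre. It is an illustration under stated assumptions rather than the configuration used in the text: equal temperament, the MIDI convention C4 = 60 with A4 = 440 Hz as the default reference, and a filter width `sigma_semitones` of 0.5 are all assumed, and the function names are hypothetical. Adapting the reference pitch is shown as a simple rescaling of the A4 frequency.

```python
import numpy as np


def midi_to_hz(midi_note, a4_hz=440.0):
    """Equal-tempered frequency of a MIDI note number (A4 = note 69)."""
    return a4_hz * 2.0 ** ((np.asarray(midi_note, dtype=float) - 69.0) / 12.0)


def gaussian_semitone_weights(bin_freqs_hz, centre_hz, sigma_semitones=0.5):
    """Gaussian weight of each spectral bin around one semitone centre.

    The distance is measured on a semitone (log-frequency) axis;
    bin frequencies must be strictly positive.
    """
    distance = 12.0 * np.log2(np.asarray(bin_freqs_hz, dtype=float) / centre_hz)
    return np.exp(-0.5 * (distance / sigma_semitones) ** 2)


# C3..C8 inclusive corresponds to MIDI notes 48..108 (assuming C4 = 60).
centres = midi_to_hz(np.arange(48, 109))
print(round(float(centres[0]), 1), round(float(centres[-1]), 1))

# Adapted reference pitch, e.g. for a recording assumed to be 20 cents flat.
detuned_a4 = 440.0 * 2.0 ** (-20.0 / 1200.0)
centres_adapted = midi_to_hz(np.arange(48, 109), a4_hz=detuned_a4)
```

Summing the weighted spectral magnitudes per semitone and folding the semitones onto 12 pitch classes would then give a CHROMA-type feature vector.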

Future design of features for key determination could consider non-CHROMA

types such as bags of chords. In addition, further music-theoretic or cognition-inspired

approaches, e.g., following [ 124 ], could be targeted. For the acoustic features, the

time-frequency representation could be improved, e.g., by wavelets [ 71 , 125 ] or

multi-resolution FFT. If one targets the mode instead of the 'absolute' key [ 126 ],

hierarchical schemes could be established. Non-tonal music audio could be modelled

as an additional class to cope with arbitrary music input [ 127 ]. Also, alternative minor

scales apart from the considered natural relative minor scale can be added. In [ 128 ],

PTR is given for harmonic and melodic minor scales, which could be implemented

directly in the presented approach.
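
As a small illustration of how the alternative minor scales differ from the natural minor, the sketch below lists the scale degrees as pitch class sets. It uses only standard music theory (interval patterns in semitones above the tonic) and does not reproduce the PTR values of [ 128 ]; the names `SCALE_INTERVALS` and `scale_pitch_classes` are hypothetical.

```python
# Interval patterns in semitones above the tonic.
SCALE_INTERVALS = {
    "major": (0, 2, 4, 5, 7, 9, 11),
    "natural_minor": (0, 2, 3, 5, 7, 8, 10),
    "harmonic_minor": (0, 2, 3, 5, 7, 8, 11),  # raised seventh degree
    "melodic_minor": (0, 2, 3, 5, 7, 9, 11),   # ascending form: raised sixth and seventh
}


def scale_pitch_classes(tonic, scale):
    """Pitch classes (0 = C, ..., 11 = B) of a scale built on `tonic`."""
    return frozenset((tonic + i) % 12 for i in SCALE_INTERVALS[scale])


c_major = scale_pitch_classes(0, "major")
a_natural = scale_pitch_classes(9, "natural_minor")
a_harmonic = scale_pitch_classes(9, "harmonic_minor")
a_melodic = scale_pitch_classes(9, "melodic_minor")

# Number of scale notes each A minor variant shares with C major.
print(len(a_natural & c_major), len(a_harmonic & c_major), len(a_melodic & c_major))
```

The natural minor shares all seven notes with its relative major, whereas the harmonic and ascending melodic variants share fewer, so templates for these scales would be distinguishable from the relative major key.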

Extending to pieces with changing key can be achieved based on local analysis

[ 129 ]. Chunking for such local analysis could be based on beat and on-beat detection

[ 6 , 23 ] as presented in the previous two sections. Temporal context can then

be integrated by the use of LSTM networks [ 23 ]. Further, the novel features could be

used in related tonal analysis tasks [ 10 ], key analysis could serve to improve music structure

analysis [ 30 , 130 ], or synergies could be exploited by parallel key and progression analysis [ 131 ]

or similar mutually dependent information [ 99 ]. Finally, the results demonstrate the

complexity of key determination, and confidence measures and key hierarchies can

be useful considerations for application in real-life systems.

11.5 Chords

A more fine-grained description beyond the musical key is provided by the chord

progression in music. In the following, the method as presented in [ 10 ] and [ 29 ] is

explained and benchmark results are presented.
