Fig. 5.2 Dimensional mood model development: multidimensional scaling of emotion-related tags as by Russell (left) and Thayer's model with four mood clusters (right) [14]

Fig. 5.3 Dimensional mood model with five discrete values for arousal and valence [14]
decided in favour of a large database in which changes of mood during a song are 'averaged out' in the annotation process, i.e., annotators assign the connotative mood one would overall have in mind for the piece. This can be sufficient in many applications, such as automatic music suggestion by the mood that best fits a listener's mood. A different question is whether a learning model would benefit from a 'cleaner' representation without changes of mood over the length of a musical piece. For NTWICM, one can assume that the contained mainstream, commercially oriented popular music is less affected by such variation than, e.g., longer arrangements of classical music. In fact, an analogue can be found in human emotion recognition: when annotated at the isolated-word level, less than half of the duration of a spoken utterance may portray the perceived emotion [30]. Yet, state-of-the-art emotion recognition from speech usually ignores this fact by using turn-level rather than word-level labels [31].
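As a minimal sketch of the 'averaging out' described above, the following hypothetical Python snippet collapses per-segment (arousal, valence) ratings into a single song-level label on the five-value scale of Fig. 5.3 (assumed here to range over -2..+2); the segment data and the averaging-plus-rounding rule are illustrative assumptions, not the annotation procedure actually used for NTWICM.

```python
# Illustrative sketch (assumed scale -2..+2 per dimension, cf. Fig. 5.3):
# mood changes within a song are 'averaged out' into one label.

def song_level_label(segment_ratings):
    """Average per-segment (arousal, valence) ratings and round each
    dimension back to the nearest of the five discrete values."""
    n = len(segment_ratings)
    mean_arousal = sum(a for a, _ in segment_ratings) / n
    mean_valence = sum(v for _, v in segment_ratings) / n
    clamp = lambda x: max(-2, min(2, round(x)))
    return clamp(mean_arousal), clamp(mean_valence)

# A song whose mood shifts over its segments still receives one label:
segments = [(2, 1), (2, 1), (0, 1), (0, 1)]  # (arousal, valence) per segment
print(song_level_label(segments))  # -> (1, 1)
```

The within-song variation is thus discarded by construction, which is exactly the trade-off discussed above for turn-level versus word-level labels in speech emotion recognition.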