testing was performed on a 598-song database
with 37 query samples.
In our own work, we created and tested many existing and new music representations, incorporating them into various algorithms that use approximate matching techniques along the lines of the systems described here and below.
As with our fast searching efforts described in
the previous section, we made use of pitch data,
duration data, and INOT data in our work. Here
we describe only our most successful technique,
which is named REPRED for relative-pitch,
relative-duration. Further details of the most
successful algorithms we developed can be found
in Kline and Glinert (2003).
The key insight behind REPRED came as a result of our development of a tempo estimator for our input query samples. Some of our algorithms used this estimator to express note durations in terms of beats. The resulting REPSCAD (relative-pitch, scaled-duration) algorithm proved fairly effective, but analysis of our input data set through our tempo estimator revealed a pattern in INOT values. While we already knew anecdotally that subjects typically compress long notes when humming or singing, we found that the drop-off could be predicted fairly well with a logarithmic transformation.
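As a rough illustration of the beat-scaling step used by REPSCAD, the following sketch converts inter-note onset times into beat units given a tempo estimate; the function name, units, and inputs are illustrative, not those of our actual implementation.

```python
# A minimal sketch of REPSCAD-style duration scaling, assuming a tempo
# estimate in beats per minute is already available from the tempo
# estimator. Names and units here are illustrative only.

def scale_to_beats(inots_ms, tempo_bpm):
    """Convert inter-note onset times (in milliseconds) to beat units."""
    ms_per_beat = 60_000.0 / tempo_bpm
    return [inot / ms_per_beat for inot in inots_ms]

# At 120 BPM one beat is 500 ms, so a 750 ms gap becomes 1.5 beats.
print(scale_to_beats([500, 750, 250], 120))  # [1.0, 1.5, 0.5]
```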
The resulting REPRED algorithm represents note durations as follows: for a given note x, the duration component is the base-2 logarithm of the ratio of the INOT value for note x to the INOT value of the note following it, that is, $\log_2(\mathrm{INOT}_x / \mathrm{INOT}_{x+1})$. Pitch values are represented as interval distances in semitones. REPRED then uses these two values within the approximate matching algorithm by means of a scaled linear combination.
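To make this concrete, here is a minimal sketch of the REPRED-style encoding and of a per-note cost of the kind a scaled linear combination implies; the weights, names, and the edit-distance framing are illustrative assumptions, since the actual scaling factors and matching details appear in Kline and Glinert (2003) rather than here.

```python
import math

def repred_encode(pitches, inots):
    """Encode a melody as REPRED-style pairs, as described above:
    a pitch interval in semitones and the log2 ratio of adjacent INOTs.
    pitches: MIDI note numbers; inots: inter-note onset times (any unit)."""
    return [
        (pitches[i + 1] - pitches[i],            # relative pitch (semitones)
         math.log2(inots[i] / inots[i + 1]))     # log2(INOT_x / INOT_{x+1})
        for i in range(len(pitches) - 1)
    ]

def note_cost(a, b, w_pitch=1.0, w_dur=1.0):
    """Scaled linear combination of the two components, of the kind that
    could serve as a per-symbol cost in an approximate (e.g., edit-distance)
    matcher. The weights are illustrative placeholders."""
    return w_pitch * abs(a[0] - b[0]) + w_dur * abs(a[1] - b[1])

# Example: C-D-E sung evenly, with the final note held twice as long.
query = repred_encode([60, 62, 64], [500, 500, 1000])
print(query)  # [(2, 0.0), (2, -1.0)]
```

Because both components are differences or ratios between adjacent notes, the encoding is invariant to transposition and to uniform changes in tempo, which is the property the relative-pitch, relative-duration name reflects.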
We found that the REPRED algorithm successfully identified the target song within the top ten results of a 3,600-song database search in 67% of our trials. When we removed from consideration the three subjects whose performances were consistently below par (one from each of our three skill groups), the success rate climbed to 78% of all trials.
A few other recent publications have suggested similar ideas. The matching algorithm in the CubyHum MIR system by Pauws (2002) also incorporates a duration ratio, but one that does not involve log scaling and is used in a different manner. A more recent system by Unal, Narayanan, and Chew (2004) similarly incorporates a duration ratio into its input representation. The MPEG-7 Melody Sequence description scheme, which was developed independently of us as our work was being completed, uses exactly the same method as our REPRED implementation to encode and represent note duration information (Gomez et al., 2003, p. 3).
Other Systems and Techniques
As part of our testing of REPRED, we submitted to the MELDEX system a small subset of our input test queries as digitized audio (.wav) files, each containing at least twelve hummed notes and each targeting a song known to be in the MELDEX database. For more than half of our tests, the system reported no matches whatsoever; others returned a list of results that did not contain the correct song; and in just one case did it correctly identify the song as the first title in the returned list of close matches. We cannot draw any definite conclusions from this informal test; the most common problem appears to have been that the MELDEX transcriber missed some of the hummed notes in our queries, but enough notes remained that it seems the queries should have returned some results, even if incorrect.
Kosugi et al. (2000) and Kosugi, Sakurai and
Morimoto (2004) produced an MIR system named
SoundCompass. Their original version utilized a
database of over 10,000 MIDI songs, while their
latest version has over 20,000. Their initial system
made use of Wildcat Canyon's Autoscore software
to handle the pitch transcription task, though they
have since made further improvements including