replacing this with their own transcriber (Kosugi, personal correspondence, 2001). To our knowledge, this was the first published system designed to make significant use of duration information as an integral part of its matching algorithms: users were required to select a metronome tempo before humming, and the system created beat-based representations of the durations of the input query notes. Once the user input was recorded and encoded in this fashion, it was processed into a series of feature vectors used for search and matching. Their newest
SoundCompass version eliminates the need for
the metronome. In 2002, Kosugi kindly shared
with us 16 of his group's input query test samples
and their system's matching results so we could
compare the performance of REPRED. We found
that REPRED correctly identified every sample as
the highest-ranking match, even the seven which
gave SoundCompass difficulty. However, the dis-
parity in database size and composition prevents
more definite conclusions from this small test.
We have not made a comparison with the more recent version of the system, which performs additional feature extraction, incorporates INOT values rather than raw durations, and includes other improvements such as multithreaded parallel searches.
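To make the beat-based duration encoding concrete, the sketch below shows one way such a representation could be computed: given the metronome tempo the user selected, each transcribed note duration in seconds is converted to beats and snapped to a simple rhythmic grid. The tempo value, grid resolution, and function name are illustrative assumptions of ours, not details taken from SoundCompass.

from typing import List

def durations_to_beats(durations_sec: List[float],
                       tempo_bpm: float,
                       grid: float = 0.25) -> List[float]:
    """Convert note durations in seconds to beat counts, quantized to a grid.

    tempo_bpm : the metronome tempo the user hummed against
    grid      : quantization step in beats (0.25 = a sixteenth note in 4/4)
    """
    seconds_per_beat = 60.0 / tempo_bpm
    beats = []
    for d in durations_sec:
        raw = d / seconds_per_beat                      # duration expressed in beats
        snapped = max(grid, round(raw / grid) * grid)   # snap to the nearest grid step
        beats.append(snapped)
    return beats

# Example: four notes hummed against a 100 BPM metronome (0.6 s per beat)
print(durations_to_beats([0.61, 0.29, 1.22, 0.33], tempo_bpm=100))
# -> [1.0, 0.5, 2.0, 0.5]

Feature vectors for matching can then be built from beat values of this kind rather than from the raw timings.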
We were also able to compare the performance of REPRED to the SuperMBox system from Jang et al. (2001a). They implemented their own pitch transcription component, which uses additional heuristics in an attempt to smooth out some of the pitch-tracking errors we have described; they also eliminated the requirement of a consonant stopping sound between successive notes, allowing continuous humming or even singing with words. (However, they did not report on the relative accuracy of their transcription process.) They assume the tempo of the user's input is consistent and exploit this assumption by linearly scaling the query representation to manipulate its effective tempo, creating several time-stretched copies that are searched in parallel, using k-means clustering and a branch-and-bound tree search on the resulting pitch vectors to identify and rank the closest matches to a hummed query. A follow-up system named MIRACLE (Jang, Lee & Kao, 2001b) added the ability to run searches in parallel on several computers, as well as a fast front-stage algorithm that prunes the database; the remainder is then searched by the more complex algorithm. We tested our input samples
against the downloadable copy of SuperMBox
available at Jang's personal Web site (Jang et
al., n.d.). With databases of comparable size but
largely different tunes, we found SuperMBox
performed about as well as REPRED when it
was constrained to match only against the start
of songs; in its match-anywhere mode, it did not
perform as well. Again, the small size of the test
and the differences in database content preclude
a formal performance comparison.
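As a rough illustration of the two ideas just described (linear tempo scaling of the query, and a fast pruning stage ahead of a more expensive matcher), the following sketch stretches a query pitch vector to several candidate tempos, prunes the database with a cheap coarse comparison, and ranks only the survivors with a full-resolution distance. The scaling factors, the start-anchored mean-absolute-difference distance, the mean-centering used to discount transposition, and the two-stage cutoff are simplified stand-ins of our own; SuperMBox and MIRACLE use k-means clustering and a branch-and-bound tree search rather than the naive comparisons shown here.

import numpy as np

def stretch(query: np.ndarray, factor: float) -> np.ndarray:
    """Linearly time-stretch a frame-based pitch vector by `factor`."""
    new_len = max(2, int(round(len(query) * factor)))
    x_old = np.linspace(0.0, 1.0, len(query))
    x_new = np.linspace(0.0, 1.0, new_len)
    return np.interp(x_new, x_old, query)

def frame_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Mean absolute pitch difference per frame, after removing transposition."""
    n = min(len(a), len(b))                  # compare against the start of the song
    a, b = a[:n] - a[:n].mean(), b[:n] - b[:n].mean()
    return float(np.abs(a - b).mean())

def search(query: np.ndarray, database: dict,
           factors=(0.8, 0.9, 1.0, 1.1, 1.25), keep: int = 10) -> list:
    """Two-stage, start-anchored search over several time-stretched query copies."""
    copies = [stretch(query, f) for f in factors]

    # Stage 1: cheap pruning using heavily downsampled frames
    coarse = sorted(
        (min(frame_distance(c[::8], song[::8]) for c in copies), name)
        for name, song in database.items()
    )
    candidates = [name for _, name in coarse[:keep]]

    # Stage 2: full-resolution comparison on the surviving candidates only
    ranked = sorted(
        (min(frame_distance(c, database[name]) for c in copies), name)
        for name in candidates
    )
    return [name for _, name in ranked]

# Hypothetical usage: the database maps tune names to pitch vectors sampled at the
# same frame rate as the query (pitch in semitones, one value per frame).
# results = search(np.asarray(query_frames), database)
# print(results[:5])   # best-ranked candidates first

In the real systems the two stages are far more sophisticated, and, as noted above, MIRACLE distributes the work so that the searches can also run in parallel on several computers.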
SUMMARY AND CONCLUSION
MIR systems that allow melody-based search queries will be most useful to the average person if hummed or sung input can be used to specify those queries. Given this allowance, there will always be errors due to the music transcription process itself, even with the anticipated continued improvement of such automated systems. Among the most significant sources of uncontrollable input error are these:
•	The recording environment often cannot be controlled. MIR systems deployed in public spaces, or which rely on wired or wireless telephone transmission, will invariably be subject to ambient noise, generating false notes.
•	Our own experience with test subjects showed the difficulty of properly adjusting the input level to minimize errors due to the natural volume of the subject as well as the subject's relative position with respect to the microphone.