Content-Based Indexing of Symbolic Music Documents - Intelligent Music Information Systems: Tools and Methodologies


most used dimension in music retrieval. Another interesting comparison of approaches to music retrieval has been presented in Hu and Dannenberg (2002), where the focus was on alternative representations for a dynamic programming approach, both from the retrieval effectiveness and from the computational cost points of view. In the presented study the computational costs of the tested approaches were comparable, and thus these results are not reported.

The organization of a number of evaluation campaigns by the research community working on the different aspects of music access, retrieval, and feature extraction (IMIRSEL, 2006), which started in 2005 (preceded in 2004 by an evaluation effort on audio analysis), will increasingly allow for the comparison of different approaches to music indexing, using standard collections (Downie, Futrelle & Tcheng, 2004).

straightforward, and can be carried out in linear time. The idea underlying this approach is that the effect of musically irrelevant N-grams will be compensated by the presence of all the musically relevant ones. It is common practice to choose small values for N, typically from 3 to 7 notes, because short units give higher recall, which is considered more significant than the resulting loss of precision. Fixed-length segmentation can be extended to polyphonic scores, with the aim of extracting all relevant monophonic tokens from concurrent multiple voices (Doraisamy & Rüger, 2004).
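
The extraction of overlapping N-grams from a monophonic melody can be illustrated with a minimal sketch. It assumes a melody is given as a list of MIDI pitch numbers; the function name `fixed_length_units` and the sample melody are hypothetical and serve only as illustration, not as the implementation used in the study.

```python
from typing import List, Sequence, Tuple


def fixed_length_units(melody: Sequence[int], n: int) -> List[Tuple[int, ...]]:
    """Return every overlapping subsequence of exactly n notes (N-grams)."""
    if n <= 0:
        raise ValueError("n must be positive")
    # One pass over the start positions: linear in the melody length for fixed n.
    return [tuple(melody[i:i + n]) for i in range(len(melody) - n + 1)]


# Hypothetical melody as MIDI pitch numbers (illustrative data only).
melody = [60, 62, 64, 60, 62, 64, 65]
trigrams = fixed_length_units(melody, 3)
```

Every note position is tried as a possible starting point, so a melody of 7 notes yields 5 overlapping units for N = 3, and a melody shorter than N yields none.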

Data-Driven Segmentation (DD)

Segmentation can be performed considering that typical passages of a given melody tend to be repeated many times (Pienimäki, 2002). The repetitions can simply be due to the presence of different choruses in the score, or can be related to the use of the same melodic material throughout the composition. Each sequence that is repeated at least K times (normally twice) is usually defined as a pattern, and is used for the description of a music document. This approach is called data-driven because patterns are computed only from the document data, without exploiting knowledge of music perception or structure. It can be considered an extension of the N-gram approach, because DD units can be of any length, with the limitation that they have to be repeated inside the melody; subpatterns that are included in longer patterns are discarded if they have the same multiplicity. Patterns can be computed from different features, such as pitch or rhythm, each feature giving a different set of DD units to describe document content. Patterns can be truncated by applying a given threshold, to reduce the size of the index and to achieve higher robustness to local errors in the query (Neve & Orio, 2004). The extension to polyphonic scores can be carried out similarly to the FL approach.
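
The pattern selection rule described above can be sketched as follows. This is a brute-force illustration, not the algorithm used in the cited work: it assumes a melody is a list of pitch numbers, counts overlapping occurrences, and introduces a hypothetical `min_len` parameter to skip single-note units. The names `data_driven_units` and `_contains` are likewise assumptions.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple

Pattern = Tuple[int, ...]


def _contains(longer: Pattern, shorter: Pattern) -> bool:
    """True if `shorter` occurs as a contiguous run inside `longer`."""
    m = len(shorter)
    return any(longer[i:i + m] == shorter for i in range(len(longer) - m + 1))


def data_driven_units(melody: Sequence[int], k: int = 2, min_len: int = 2) -> Dict[Pattern, int]:
    """Map each retained pattern to its multiplicity (number of occurrences)."""
    notes = tuple(melody)
    counts: Counter = Counter()
    # Brute force: count every contiguous subsequence (overlapping occurrences included).
    for length in range(min_len, len(notes) + 1):
        for start in range(len(notes) - length + 1):
            counts[notes[start:start + length]] += 1
    # Keep only sequences repeated at least k times.
    repeated = {p: c for p, c in counts.items() if c >= k}
    # Discard a subpattern when a longer pattern with the same multiplicity contains it.
    return {
        p: c
        for p, c in repeated.items()
        if not any(len(q) > len(p) and repeated[q] == c and _contains(q, p) for q in repeated)
    }


# Hypothetical melodies (illustrative data only).
units_a = data_driven_units([60, 62, 64, 60, 62, 64, 65])
units_b = data_driven_units([1, 2, 3, 1, 2, 3, 1, 2])
```

In the second melody the unit (1, 2) survives alongside the longer (1, 2, 3, 1, 2) because it occurs three times rather than twice, which is the case the same-multiplicity condition is meant to preserve. Enumerating all subsequences is quadratic in the melody length, so this sketch is suited only to short examples.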

Approaches to Melodic Segmentation

The approaches to music segmentation can be roughly divided into two main groups: the ones that highlight the lexical units using only the document content, and the ones that exploit prior information about music theory and perception. Four different approaches, two for each group, have been tested.

Fixed-Length Segmentation (FL)

The simplest segmentation approach consists of the extraction from a melody of subsequences of exactly N notes, called N-grams (Downie & Nelson, 2000). N-grams may overlap, because no assumption is made on the possible starting point of a theme, nor on the possible repetitions of relevant music passages. The strength of this approach is its simplicity, because it relies neither on assumptions drawn from theories of music composition or perception, nor on the analysis of complete melodies. The exhaustive computation of FL units is
