The effect of alternative approaches to segmentation is shown in Figure 5, which graphically highlights the lexical units identified by the different algorithms. The algorithms are the ones included in the MidiToolbox (Eerola & Toiviainen, 2004) and correspond, from the top, to PB, to a probabilistic approach not tested in the present study, and to MO.
Table 3. Main characteristics of the index terms obtained from the different segmentation techniques

                           FL        DD        PB        MO
Average length             3.0       4.8       4.0       3.6
Average units/document     52.1      61.9      43.2      45.0
Number of units            70093     123654    70713     67893
Characteristics of the Index Terms
The comparison has been carried out according
to the Cranfield model for information retrieval.
A music test collection of popular music has been
created with 2310 MIDI files as music documents.
MIDI is a well-known standard for the representation of music documents that can be synthesized to create audible performances (Rothstein, 1991). MIDI is becoming obsolete both as a format for music to be listened to, because of the widespread diffusion of compressed audio formats such as MP3, and as a format for representing notated music, because of the creation of new formats for analyzing, structuring, and printing music (Selfridge-Field, 1997). The availability
of large collections of music files in MIDI is the
main reason why this format is still widely used
for music retrieval experiments.
From the collection of MIDI files, the channels containing the melody have been extracted automatically and the note durations have been normalized; for polyphonic channels, the highest pitch has been chosen as part of the melody (Uitdenbogerd & Zobel, 1998). After preprocessing, the collection contained complete melodies with an average length of 315.6 notes. A set of
40 queries, with average length of 9.7 notes, has
been created as recognizable examples of both
choruses and refrains of 20 randomly selected
songs. Only the theme from which the query was
taken was considered as relevant, considering a
query-by-examples paradigm where the example
is an excerpt of a particular work that needs to be
retrieved. This assumption simplifies the creation
of the relevance judgments, which can be built automatically. Alternatively, relevance judgments can be created by pooling excerpts, which may reveal that more than one document is relevant to a particular query (Typke, den Hoed, de Nooijer, Wiering & Veltkamp, 2005). The initial queries did not contain errors and had a length that allowed for clear recognition of the main theme. Robustness to errors has been tested by modifying note pitches and durations, while the effect of query length has been tested by shortening the original queries.
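The preprocessing and query-degradation steps described above can be sketched as follows. The representation of notes as (onset, pitch, duration) triples, the normalization by the shortest duration, and the single-semitone/single-step error model are assumptions for illustration, not the exact implementation used in the study.

```python
import random

# Hypothetical note representation: (onset, pitch, duration) triples.

def to_monophonic(channel):
    """Reduce a polyphonic channel to a melody line by keeping,
    at each onset time, only the note with the highest pitch."""
    by_onset = {}
    for onset, pitch, duration in channel:
        if onset not in by_onset or pitch > by_onset[onset][1]:
            by_onset[onset] = (onset, pitch, duration)
    return [by_onset[t] for t in sorted(by_onset)]

def normalize_durations(melody):
    """Normalize durations by the shortest one in the melody
    (one possible normalization; the study does not specify its scheme)."""
    shortest = min(d for _, _, d in melody)
    return [(o, p, d / shortest) for o, p, d in melody]

def perturb_query(query, n_errors, seed=0):
    """Introduce errors by changing the pitch or the duration of
    randomly chosen notes; query notes are (pitch, duration) pairs."""
    rng = random.Random(seed)
    notes = list(query)
    for i in rng.sample(range(len(notes)), n_errors):
        pitch, dur = notes[i]
        if rng.random() < 0.5:
            notes[i] = (pitch + rng.choice([-1, 1]), dur)
        else:
            notes[i] = (pitch, dur + (1 if dur <= 1 else rng.choice([-1, 1])))
    return notes

def shorten_query(query, length):
    """Test the effect of query length by truncating the query."""
    return query[:length]
```

Applying `to_monophonic` before `normalize_durations` mirrors the order described above: the melody line is isolated first, so that normalization operates only on the notes that survive into the index.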
Table 3 shows the main characteristics of lexi-
cal units, and thus of the index terms, extracted
with the segmentation approaches, giving a preliminary idea of how each segmentation approach describes the document collection. The values
reported in the table have been computed with
the following experimental setup: FL has been
computed with N-grams of three notes; DD has
been computed applying a threshold of five notes;
PB and MO have been computed using the algo-
rithms presented in Eerola and Toiviainen (2004).
For these four approaches, units are sequences of pairs of values (pitch and duration), and the index contains one entry for each distinct sequence.
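As a minimal sketch of the FL case, fixed-length units can be extracted as overlapping N-grams of (pitch, duration) pairs and collected into an inverted index with one entry per distinct sequence. The data layout below is an assumption for illustration, not the study's actual index structure.

```python
from collections import Counter

def fixed_length_units(melody, n=3):
    """FL segmentation: overlapping N-grams of (pitch, duration) pairs."""
    return [tuple(melody[i:i + n]) for i in range(len(melody) - n + 1)]

def build_index(documents, n=3):
    """Inverted index: one entry per distinct unit, mapping it to
    the documents that contain it, with occurrence counts."""
    index = {}
    for doc_id, melody in documents.items():
        for unit in fixed_length_units(melody, n):
            index.setdefault(unit, Counter())[doc_id] += 1
    return index
```

With N = 3, a melody of m notes yields m - 2 overlapping units, which is consistent with the large unit counts reported in Table 3.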
The approaches gave comparable results in terms of the average length of lexical units, about three to four notes, and in the average number of different units per document. This behavior differs from the results of the perceptual study on manual segmentation,