The effect of alternative approaches to segmentation is shown in Figure 5, which graphically highlights the lexical units identified by the different algorithms. The algorithms are the ones included in the MidiToolbox (Eerola & Toiviainen, 2004) and correspond, from the top, to PB, to a probabilistic approach not tested in the present study, and to MO.
Table 3. Main characteristics of the index terms obtained from the different segmentation techniques

                           FL        DD        PB        MO
Average length             3.0       4.8       4.0       3.6
Average units/document     52.1      61.9      43.2      45.0
Number of units            70093     123654    70713     67893
Characteristics of the Index Terms
The comparison has been carried out according
to the Cranfield model for information retrieval.
A music test collection of popular music has been
created with 2310 MIDI files as music documents.
MIDI is a well-known standard for the representation of music documents that can be synthesized to create audible performances (Rothstein, 1991). MIDI is becoming obsolete both as a format for music to be listened to, because of the widespread diffusion of compressed audio formats such as MP3, and as a format for representing notated music, because of the creation of new formats for analyzing, structuring, and printing music (Selfridge-Field, 1997). The availability
of large collections of music files in MIDI is the
main reason why this format is still widely used
for music retrieval experiments.
From the collection of MIDI files, the channels containing the melody have been extracted automatically and the note durations have been normalized; for polyphonic channels, the highest pitch has been chosen as part of the melody (Uitdenbogerd & Zobel, 1998). After preprocessing, the collection contained complete melodies with an average length of 315.6 notes. A set of
40 queries, with average length of 9.7 notes, has
been created as recognizable examples of both
choruses and refrains of 20 randomly selected
songs. Only the theme from which the query was
taken was considered as relevant, considering a
query-by-examples paradigm where the example
is an excerpt of a particular work that needs to be
retrieved. This assumption simplifies the creation
of the relevance judgments, which can be built automatically. Alternatively, relevance judgments can be created by pooling excerpts, which may reveal that more than one document is relevant to a particular query (Typke, den Hoed, de Nooijer, Wiering & Veltkamp, 2005). The initial queries did not contain errors and had a length that allowed for clear recognition of the main theme. Robustness to errors has been tested by modifying note pitches and durations, while the effect of query length has been tested by shortening the original queries.
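The preprocessing and query-degradation steps described above can be sketched as follows. The representation of notes as (onset, pitch, duration) triples, the normalization by the shortest duration, and the single-semitone/single-step error model are assumptions for illustration, not the exact implementation used in the study.

```python
import random

# Hypothetical note representation: (onset, pitch, duration) triples.

def to_monophonic(channel):
    """Reduce a polyphonic channel to a melody line by keeping,
    at each onset time, only the note with the highest pitch."""
    by_onset = {}
    for onset, pitch, duration in channel:
        if onset not in by_onset or pitch > by_onset[onset][1]:
            by_onset[onset] = (onset, pitch, duration)
    return [by_onset[t] for t in sorted(by_onset)]

def normalize_durations(melody):
    """Normalize durations by the shortest one in the melody
    (one possible normalization; the study does not specify its scheme)."""
    shortest = min(d for _, _, d in melody)
    return [(o, p, d / shortest) for o, p, d in melody]

def perturb_query(query, n_errors, seed=0):
    """Introduce errors by changing the pitch or the duration of
    randomly chosen notes; query notes are (pitch, duration) pairs."""
    rng = random.Random(seed)
    notes = list(query)
    for i in rng.sample(range(len(notes)), n_errors):
        pitch, dur = notes[i]
        if rng.random() < 0.5:
            notes[i] = (pitch + rng.choice([-1, 1]), dur)
        else:
            notes[i] = (pitch, dur + (1 if dur <= 1 else rng.choice([-1, 1])))
    return notes

def shorten_query(query, length):
    """Test the effect of query length by truncating the query."""
    return query[:length]
```

Applying `to_monophonic` before `normalize_durations` mirrors the order described above: the melody line is isolated first, so that normalization operates only on the notes that survive into the index.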
Table 3 shows the main characteristics of lexi-
cal units, and thus of the index terms, extracted
with the segmentation approaches, giving a preliminary idea of how each segmentation approach describes the document collection. The values
reported in the table have been computed with
the following experimental setup: FL has been
computed with N-grams of three notes; DD has
been computed applying a threshold of five notes;
PB and MO have been computed using the algo-
rithms presented in Eerola and Toiviainen (2004).
For these four approaches, units are sequences of pairs of values (pitch and duration), and the index contains one entry for each distinct sequence.
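As a minimal sketch of the FL case, fixed-length units can be extracted as overlapping N-grams of (pitch, duration) pairs and collected into an inverted index with one entry per distinct sequence. The data layout below is an assumption for illustration, not the study's actual index structure.

```python
from collections import Counter

def fixed_length_units(melody, n=3):
    """FL segmentation: overlapping N-grams of (pitch, duration) pairs."""
    return [tuple(melody[i:i + n]) for i in range(len(melody) - n + 1)]

def build_index(documents, n=3):
    """Inverted index: one entry per distinct unit, mapping it to
    the documents that contain it, with occurrence counts."""
    index = {}
    for doc_id, melody in documents.items():
        for unit in fixed_length_units(melody, n):
            index.setdefault(unit, Counter())[doc_id] += 1
    return index
```

With N = 3, a melody of m notes yields m - 2 overlapping units, which is consistent with the large unit counts reported in Table 3.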
The approaches gave comparable results in terms of the average length of lexical units, about three to four notes, and in the average number of different units per document. This behavior differs from the results of the perceptual study on manual segmentation,