Content-Based Indexing of Symbolic Music Documents - Intelligent Music Information Systems: Tools and Methodologies


compared with the documents in the collection using approximate string matching. For example, approximate string matching was proposed in one of the earliest papers on music retrieval (Ghias, Logan, Chamberlin & Smith, 1995), while Dynamic Time Warping was proposed by Hu and Dannenberg (2002). Statistical approaches have been proposed as well, in particular Markov chains (Birmingham, Dannenberg, Wakefield, Bartsch, Bykowski & Mazzoni, 2001) and hidden Markov models (Shifrin, Pardo, Meek & Birmingham, 2002). The advantage of these approaches is that the difference between the query and the documents can be modeled by explicitly considering all the possible mismatches, so very high retrieval effectiveness can be achieved. On the other hand, all these techniques require the string representing the query to be matched against every document in the collection, giving a complexity that is linear in the number of documents. Scalability to large collections of millions of documents then becomes an issue.
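The linear cost of matching a query against the whole collection can be made concrete with a minimal sketch. It assumes, purely for illustration, that each melody is encoded as a string over a three-letter contour alphabet (U = up, D = down, S = same) and that similarity is measured by plain edit distance between whole strings; the function names and the toy collection are hypothetical and do not reproduce any of the cited systems.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (unit costs)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x from a
                            curr[j - 1] + 1,      # insert y into a
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def rank_by_scan(query, collection):
    """Rank documents by distance to the query.

    Every document is visited once, so the number of comparisons
    grows linearly with the size of the collection.
    """
    scored = [(edit_distance(query, doc), doc_id)
              for doc_id, doc in collection.items()]
    return [doc_id for _, doc_id in sorted(scored)]


# Hypothetical toy collection of contour strings.
collection = {"a": "DDSU", "b": "UUDS", "c": "UUDD"}
ranking = rank_by_scan("UUDS", collection)
```

Each call to `rank_by_scan` performs one distance computation per document, which is the behavior that becomes a bottleneck for collections of millions of documents.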

For this reason, alternative approaches have been proposed that take advantage of indexing (Doraisamy & Rüger, 2004; Downie & Nelson, 2000; Melucci & Orio, 2004; Pienimäki, 2002). Moreover, other IR techniques can be applied to music retrieval. For instance, Hoashi, Matsumoto and Inoue (2003) applied relevance feedback to a melodic retrieval task, with the main goal of personalizing the results. The metaphor of navigation inside a collection of documents, which corresponds to document browsing, has also been proposed (Blackburn & DeRoure, 1998). Indexing is also widely used to retrieve or recognize music in audio format, in particular in audio fingerprinting and audio watermarking techniques (Cano, Batlle, Kalker & Haitsma, 2005).
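To illustrate how indexing avoids comparing the query with every document, the following minimal sketch builds an inverted index over melodic n-grams. It assumes, for illustration only, that each document is a sequence of pitch intervals in semitones and that fixed-length n-grams serve as index terms; the names `ngrams`, `build_index` and `search` and the toy data are hypothetical, and the cited systems may choose their index terms differently.

```python
from collections import Counter, defaultdict


def ngrams(seq, n):
    """All contiguous subsequences of length n, as hashable tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def build_index(collection, n=3):
    """Map each n-gram to the set of documents that contain it."""
    index = defaultdict(set)
    for doc_id, intervals in collection.items():
        for gram in ngrams(intervals, n):
            index[gram].add(doc_id)
    return index


def search(index, query, n=3):
    """Rank documents by the number of distinct query n-grams they share.

    Only the posting lists of the query's n-grams are read, so documents
    sharing no term with the query are never touched.
    """
    votes = Counter()
    for gram in set(ngrams(query, n)):
        for doc_id in index.get(gram, ()):
            votes[doc_id] += 1
    return [doc_id for doc_id, _ in votes.most_common()]


# Hypothetical toy collection of interval sequences (semitones).
collection = {
    "a": [2, 2, -4, 1, 2],
    "b": [2, 2, -4, 5, -1],
    "c": [0, 0, 0, 0],
}
index = build_index(collection)
results = search(index, [2, 2, -4, 1])
```

The index is built once, offline; at query time the work depends on the length of the posting lists for the query terms rather than on the total number of documents. The price is that exact n-gram lookup tolerates mismatches less gracefully than the approximate matching techniques discussed earlier.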

This chapter describes some aspects of content-based indexing, as opposed to metadata indexing. It reviews the basic concepts and then discusses some key aspects in more detail: the consistency with which candidate index terms are perceived by listeners, the effectiveness of alternative approaches to computing indexes, and how individual indexing schemes can be combined by applying data fusion approaches.

Metadata vs. Content-Based Indexing

The first problem that arises when choosing an indexing scheme for a music collection concerns the most effective representation of document content, in particular whether documents should be described by external metadata or directly by a synthetic representation of their content. Both approaches have positive and negative aspects. Metadata usually requires extensive manual work, both to retrieve external information on the documents and to represent compactly most of the subtleties of document content; this increases the cost of indexing and does not guarantee consistency when different documents are indexed by different people. Automatic computation of metadata based on external resources has been proposed in collaborative filtering systems aimed, for example, at recommendation, but the results are expressed in terms of similarity between documents and are biased by the presence of scattered data (Stenzel & Kamps, 2005). At the current state of the art, these techniques do not seem suitable for a retrieval task. Content-based indexing starts from a set of features extracted automatically from the document itself, and it is the main focus of this chapter.

Metadata

For most media, such as images and video, the choice of textual metadata has proved particularly effective. Using textual metadata to describe and index music is a natural choice that has been made for centuries (Dunn & Mayer, 1999). In general metadata, especially in the form
