Content-Based Indexing of Symbolic Music Documents - Intelligent Music Information Systems: Tools and Methodologies

Figure 1. Possible outcomes of the lexical analysis of a simple melody;

the bars in grey are alternative segmentations in melodic lexical units

Even after a sequence of features has been

extracted automatically from a music document,

lexical analysis of music documents remains a

difficult task. The reason is that music language

lacks explicit separators between candidate

index terms for all of its dimensions. Melodic

phrases are not delimited by particular signs or

sounds that express the presence of a boundary be-

tween two phrases. The same applies to harmonic

progressions, or rhythmic patterns. In all these

cases there is no additional symbol that expresses

the ending of a lexical unit and the beginning of

the next one. This is not surprising, because the

very concept of lexical unit is borrowed from the

textual domain, and it is not part of the traditional

representation of music documents. Even if there

is a wide consensus in considering music as a

structured organization of different elements, and

not just a pure sequence of sounds, there was no

historical need to represent this aspect directly.

Music is printed for musicians, who basically need

the information to create a correct performance,

and who could infer the presence of basic elements

from the context. Different approaches have been

proposed for lexical analysis, considering musical

patterns (Hsu, Liu & Chen, 1998), main themes

(Meek & Birmingham, 2003), or musical phrases

(Melucci & Orio, 1999).

The consistency between musicians in per-

forming the lexical analysis of some monophonic

written scores has been investigated in a percep-

tual study, which is presented in the next section.

The lexical analysis of music documents is still

an open problem, both in terms of musicological

analysis, because alternative theories have been

presented, and in terms of indexing and retrieval

effectiveness. As an example, Figure 1 reports

possible segmentations of the same musical

excerpt.

Stop-Words Removal

Many words that are part of a textual document

have only a grammatical function, and do not

express any semantics. In most languages, articles,

conjunctions, prepositions, and so on can be

deleted without substantially affecting the com-

prehension of the text. Moreover, if after indexing

a document is described by a simple list of terms

in a given order (e.g., alphabetical), the fact that

the document contained a particular conjunction

does not give any additional information on

the document content. These words can thus be

removed, or stopped (hence the term “stop-words”),

from the output of the lexical analysis

without affecting the overall performance of an

indexing system. Given that stop-words of this

kind are very frequent in textual documents, their

removal improves system performance in

terms of the storage needed for the indexes and thus

of the computational cost of retrieval. For any

particular language, a list of stop-words, called a

stop-list, can be derived from a priori knowledge

of the grammatical rules.
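
The filtering step described above can be sketched as follows. This is a minimal illustration, assuming the lexical analysis has already produced a list of terms; the stop-list contents and the function name `remove_stop_words` are hypothetical, and a real stop-list would be much longer.

```python
# Hypothetical English stop-list; a real one would be derived from
# a priori knowledge of the grammatical rules of the language.
STOP_LIST = frozenset(
    {"a", "an", "the", "and", "or", "but", "of", "in", "on", "to"}
)

def remove_stop_words(terms, stop_list=STOP_LIST):
    """Return the terms whose lowercase form is not in the stop-list,
    preserving their original order."""
    return [t for t in terms if t.lower() not in stop_list]

# Output of a (trivial, whitespace-based) lexical analysis.
terms = "the lexical analysis of a music document".split()
print(remove_stop_words(terms))
# ['lexical', 'analysis', 'music', 'document']
```

Storing the stop-list as a set keeps each membership test constant-time on average, so the filter adds little cost to indexing while shrinking the index.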

Stop-words removal can also be applied to

words that, though carrying a meaning that could

be used to describe the document content, are

extremely frequent inside a collection of docu-

ments. For example, the lexical analysis of a set of

documents on music processing will very likely
