Figure 1. Possible outcomes of the lexical analysis of a simple melody;
the bars in grey are alternative segmentations in melodic lexical units
Even after a sequence of features has been
extracted automatically from a music document,
lexical analysis of music documents remains a
difficult task. The reason is that the language of music
lacks explicit separators between candidate
index terms in all of its dimensions. Melodic
phrases are not delimited by particular signs or
sounds that mark the presence of a boundary
between two phrases. The same applies to harmonic
progressions and rhythmic patterns. In all these
cases there is no additional symbol that expresses
the ending of a lexical unit and the beginning of
the next one. This is not surprising, because the
very concept of a lexical unit is borrowed from the
textual domain and is not part of the traditional
representation of music documents. Even if there
is wide consensus in considering music a
structured organization of different elements, and
not just a pure sequence of sounds, there has been no
historical need to represent this aspect directly.
Music is printed for musicians, who mainly need
the information required for a correct performance
and who can infer the presence of basic elements
from the context. Different approaches have been
proposed for lexical analysis, considering musical
patterns (Hsu, Liu & Chen, 1998), main themes
(Meek & Birmingham, 2003), or musical phrases
(Melucci & Orio, 1999).
The consistency among musicians in performing
the lexical analysis of some monophonic
written scores has been investigated in a perceptual
study, which is presented in the next section.
The lexical analysis of music documents is still
an open problem, both in terms of musicological
analysis, because alternative theories have been
presented, and in terms of indexing and retrieval
effectiveness. As an example, Figure 1 reports
possible segmentations of the same musical
excerpt.
Stop-Words Removal
Many words that are part of a textual document
have only a grammatical function, and do not
express any semantics. In most languages, articles,
conjunctions, prepositions, and so on can be
deleted without substantially affecting the
comprehension of the text. Moreover, if after indexing
a document is described by a simple list of terms
in a given order (e.g., alphabetical), the fact that
the document contained a particular conjunction
does not give any additional information on
the document content. These words can thus be
removed, or stopped (hence the term
“stop-words”), from the output of the lexical analysis
without affecting the overall performance of an
indexing system. Given that stop-words of this
kind are very frequent in textual documents, their
removal improves system performance in
terms of the storage needed for the indexes and thus
of the computational cost of retrieval. For any
particular language, a list of stop-words, named a
stop-list, can be derived from a priori knowledge
of the grammatical rules.
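As a minimal sketch of the filtering step described above, the following Python fragment removes stop-words from the output of a lexical analysis. The stop-list contents, the name remove_stop_words, and the whitespace tokenization are assumptions made for illustration; they are not taken from the text, and a real system would use a fuller, language-specific stop-list.

```python
# Hypothetical, deliberately short stop-list for English (an assumption for
# illustration; not a list given in the text).
STOP_LIST = frozenset({"a", "an", "and", "the", "of", "in", "on", "or", "to", "is"})


def remove_stop_words(tokens, stop_list=STOP_LIST):
    """Return the tokens that are not in the stop-list, preserving their order."""
    # Comparison is case-insensitive; the surviving tokens keep their original case.
    return [token for token in tokens if token.lower() not in stop_list]


# Whitespace splitting stands in for the lexical analysis step.
tokens = "the analysis of a melody and the rhythm in the score".split()
index_terms = remove_stop_words(tokens)
print(index_terms)  # ['analysis', 'melody', 'rhythm', 'score']
```

Storing the stop-list as a set keeps each membership test at constant expected cost, so the filter adds little to the overall indexing time.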
Stop-words removal can also be applied to
words that, though carrying semantics that could
be used to describe the document content, are
extremely frequent inside a collection of
documents. For example, the lexical analysis of a set of
documents on music processing will very likely