Tools for Music Information Retrieval and Playing - Intelligent Music Information Systems: Tools and Methodologies

Information Technology Reference

In-Depth Information

difficult. Therefore, it is normally assumed that

music is monophonic (produced using a single

instrument) and stored as scores in the database.

The pitch of each note is known. The common

query input form is humming. To improve pitch

tracking performance on the query input, a pause is

normally required between consecutive notes.

There are two pitch representations. In the

first method, each pitch except the first one is

represented as pitch direction (or change) relative

to the previous note. The pitch direction is either

U(up), D(down) or S(similar). Thus, each musical

piece is represented as a string of three symbols

or characters.

The second pitch representation method rep-

resents each note as a value based on a chosen

reference note. The value is assigned from a set

of standard pitch values that is closest to the es-

timated pitch. If we represent each allowed value

as a character, each musical piece or segment is

represented as a string of characters. But in this

case, the number of allowed symbols is much

greater than the three that are used in the first

pitch representation.

After each musical piece is represented as a

string of characters, the final stage is to find a match

or similarity between the strings. Considering

that humming is not exact and the user may be

interested in find similar musical pieces instead

of just the same one, approximate matching is

used instead of exact matching. The approximate

matching problem is that of string matching with

k mismatches. The variable k can be determined

by the user of the system. The problem consists

of finding all instances of a query string Q =

q1q2q3 . . . qm in a reference string R = r1r2r3

. . . rn such that there are at most k mismatches

(characters that are not the same). There are sev-

eral algorithms that have developed to address the

problem of approximate string matching (Wold

et al., 1996).

Both the systems of Muscle Fish LLC (Wold

et al., 1996) and the University Waikato produced

good retrieval performance. But the perform-

ance depends on the accuracy of pitch tracking

of hummed input signals. High performance is

only achieved when a pause is inserted between

consecutive notes.

Since humming is the most natural way to

formulate music queries for people who are not

trained or educated with music theory. Therefore

many researchers have proposed techniques for

query-by-humming.

Many techniques based on melody matching

are proposed, and some of them can support query-

by-humming (Logan, Ghias, & Chamberlin, 1995;

Zhu & Kankanhalli, 2002).

For a query-by-humming system to work well,

reliable pitch detection in the humming is critical.

There are 2 types of query-by-humming methods:

(1) the method based on music notes segmenta-

tion and matching; (2) and the method based on

continuous pitch contour matching.

The first type of methods (Pollastri, 2002)

requires the user to separate each music note

with a short silence or hum with a particular

syllable (such as “Da”). By such restrictions, it

is assumed that each individual note can he ac-

curately segmented by using signal energy. The

pitch for the segmented note is then estimated.

The restrictions, however, make the methods less

practical for a real-world music retrieval system,

since a user cannot always be aware of the note

boundaries particularly when there are tied notes

in the melody. Figure 20 illustrates the flow of

such pitch detection method.

The second type of query-by-humming

methods does not impose the above mentioned

restrictions. The pitch value for each audio (a short

time window) is estimated, and then melody is

represented using a pitch contour, or a time series

of pitch values with no music note identities.

Music retrieval is done by similarity match-

ing of the pitch contours (Jang, 2001; Zhu, 2002).

These approaches have shown a better perfor-

mance than the first type of method. Although

note segmentation error does not exist in this

Intelligent Music Information Systems: Tools and Methodologies

Search WWH ::

Custom Search

Home