Identifying Saxophonists from Their Playing Styles - Intelligent Music Information Systems: Tools and Methodologies


Figure 4. Energy envelope and its linear approximation of a real excerpt with intranote segment limits marked

is defined as the segment between the start of the most negative of the computed slopes and the note offset. Sustain is restricted to the remaining segment. When the end-of-attack and start-of-release limits of a note coincide, the note is considered to have no sustain segment.
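The rule above can be sketched in a few lines. This is only an illustration under stated assumptions: it takes the segment described above to be the release, it receives the end of the attack as an input because that limit is computed elsewhere, and the function name `intranote_limits` and its arguments are hypothetical.

```python
from typing import List, Optional, Tuple


def intranote_limits(
    breakpoints: List[float],
    slopes: List[float],
    attack_end: float,
) -> Tuple[float, Optional[Tuple[float, float]]]:
    """Return (release_start, sustain) for one note.

    breakpoints: times of the knots of the linear approximation, from note
                 onset to note offset (one more value than slopes).
    slopes:      slope of each linear piece between consecutive knots.
    attack_end:  end of the attack segment (assumed to be given).
    """
    if len(breakpoints) != len(slopes) + 1:
        raise ValueError("expected one more breakpoint than slopes")

    # The release starts where the most negative slope starts and runs
    # to the note offset (the last breakpoint).
    steepest = min(range(len(slopes)), key=lambda i: slopes[i])
    release_start = breakpoints[steepest]

    # Sustain is the remaining segment between attack and release. When
    # the two limits coincide, the note has no sustain segment. Treating an
    # overlap the same way is an assumption of this sketch.
    if release_start <= attack_end:
        return release_start, None
    return release_start, (attack_end, release_start)


with_sustain = intranote_limits([0.0, 0.1, 0.5, 0.7], [5.0, -0.2, -3.0], 0.1)
without_sustain = intranote_limits([0.0, 0.1, 0.3], [5.0, -4.0], 0.1)
```

In the first call the steepest decay starts at 0.5 s, so the sustain spans 0.1 s to 0.5 s. In the second, the release starts exactly where the attack ends, so no sustain is reported.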

Intranote segment characterization. Once we have found the intranote segment limits, we describe each segment by its duration (absolute and relative to note duration), start and end times, initial and final energy values (absolute and relative to note maximum) and slope. For the stable part of each note (sustain segment), we extract an averaged spectral centroid and spectral tilt in order to have timbral descriptors related to the brightness of a particular execution. We compute the spectral centroid as the frequency bin corresponding to the barycenter of the spectrum, expressed as (5), where fft is the fast Fourier transform of a frame, N is the size of the fast Fourier transform, and k is the bin index. For the spectral tilt, we perform a linear regression of the logarithmic spectral envelope between 2 kHz and 6 kHz, and take the slope, expressed in dB/Hz.
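The two timbral descriptors can be sketched as follows. This is a minimal illustration rather than the authors' implementation: it assumes each frame is already available as a list of non-negative spectral magnitudes, it uses those magnitudes directly in place of a separately estimated spectral envelope, and it converts them to dB with 20·log10. The function names, the `eps` floor and the example values are hypothetical.

```python
import math
from typing import Sequence


def spectral_centroid(mag: Sequence[float]) -> float:
    """Centroid as a bin index: sum(k * mag_k) / sum(mag_k), with k = 1..N."""
    total = sum(mag)
    if total == 0:
        raise ValueError("spectrum has no energy")
    # mag[0] is treated as bin k = 1, following the 1-based sum in (5).
    return sum(k * m for k, m in enumerate(mag, start=1)) / total


def spectral_tilt(
    freqs_hz: Sequence[float],
    mag: Sequence[float],
    lo_hz: float = 2000.0,
    hi_hz: float = 6000.0,
    eps: float = 1e-12,
) -> float:
    """Least-squares slope (dB/Hz) of the log spectrum within [lo_hz, hi_hz]."""
    # Keep only the bins inside the band and convert magnitudes to dB.
    pts = [
        (f, 20.0 * math.log10(max(m, eps)))
        for f, m in zip(freqs_hz, mag)
        if lo_hz <= f <= hi_hz
    ]
    if len(pts) < 2:
        raise ValueError("need at least two bins inside the band")
    n = len(pts)
    mean_f = sum(f for f, _ in pts) / n
    mean_db = sum(d for _, d in pts) / n
    # Ordinary least-squares slope: cov(f, dB) / var(f).
    num = sum((f - mean_f) * (d - mean_db) for f, d in pts)
    den = sum((f - mean_f) ** 2 for f, _ in pts)
    return num / den


# Synthetic spectrum that falls by exactly 0.001 dB per Hz.
freqs = [1000.0, 2000.0, 3000.0, 4000.0, 5000.0, 6000.0, 7000.0]
mags = [10 ** (-0.001 * (f - 2000.0) / 20.0) for f in freqs]
centroid = spectral_centroid([1.0, 1.0, 1.0, 1.0])
tilt = spectral_tilt(freqs, mags)
```

To obtain the averaged values the text refers to, one would apply these functions to every frame of the sustain segment and take the mean.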

Performance-Driven Interpreter Identification

In this section, we describe our approach to the problem of recognizing saxophonists from their playing style. In particular, we introduce the different note descriptors we use to characterize the internal and contextual note properties (computed as described in the previous section), as well as the different algorithms we apply to identify interpreters from their playing style.

Note Descriptors

We characterize each performed note by the following two sets of features:

• Perceptual (intranote) features. The perceptual features represent perceptual properties of a note, which are specified as intranote characteristics of the audio signal. The perceptual features included in the research reported here are the note's attack level, sustain duration, sustain slope, amount of legato with the previous note, amount of legato with the following note, mean energy, spectral centroid and spectral tilt. That is, each performed note is perceptually characterized by the tuple
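As a sketch, the eight perceptual features named in this item could be stored in one record per note. The class name, the field names and their order are assumptions made for illustration, not the authors' notation, and the example values are arbitrary placeholders rather than measured data.

```python
from typing import NamedTuple


class PerceptualFeatures(NamedTuple):
    """One record per performed note (hypothetical field names)."""

    attack_level: float
    sustain_duration: float
    sustain_slope: float
    legato_prev: float  # amount of legato with the previous note
    legato_next: float  # amount of legato with the following note
    mean_energy: float
    spectral_centroid: float
    spectral_tilt: float


# Arbitrary placeholder values, only to show how a record is built.
note = PerceptualFeatures(
    attack_level=0.8,
    sustain_duration=0.25,
    sustain_slope=-0.1,
    legato_prev=0.0,
    legato_next=1.0,
    mean_energy=0.6,
    spectral_centroid=42.0,
    spectral_tilt=-0.002,
)
```

A named tuple keeps the features in a fixed order, so a record can be passed directly to a learning algorithm as a feature vector while remaining readable by field name.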

$$SC = \frac{\sum_{k=1}^{N} k \, \mathrm{fft}(k)}{\sum_{k=1}^{N} \mathrm{fft}(k)} \qquad (5)$$
