Generally speaking, 'rhythm' describes patterns of changes. In music, a 'beat'
corresponds to the perceived pulses that mark off equal durational units and is our
basis of comparison for measurements of rhythmic durations. The 'tempo' refers to
the beats' 'striking rate', whereas 'metre' represents the accent structure of the beats.
Considering 'metre', the metrical structure of a musical piece is composed of multiple
hierarchical levels [66]. There, the tempo on the lowest level, which is also referred
to as the 'tatum' level, is an integer multiple of the tempo on each higher level. When
we tap along with a song, we do this on the 'pulse' or 'beat' level, whose rate can be
referred to as the quarter-note tempo. The 'bar' or 'measure' level corresponds to the
unit of a bar in notated music. The relation between the measure and beat levels then
is the metre or 'time signature' of a musical piece.
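To make the relation between these levels concrete, the following minimal Python
sketch (the function name and all numerical values are purely illustrative
assumptions) derives the bar- and tatum-level tempi from a given quarter-note tempo,
time signature, and tatum subdivision:

def metrical_levels(beat_bpm, beats_per_bar, tatum_per_beat):
    # Derive the tempi of the remaining metrical levels from the beat tempo.
    return {
        'bar_bpm': beat_bpm / beats_per_bar,     # measure level (slowest)
        'beat_bpm': beat_bpm,                    # pulse/beat level
        'tatum_bpm': beat_bpm * tatum_per_beat,  # tatum level (fastest)
    }

# A waltz at 180 quarter notes per minute in 3/4 time with an eighth-note tatum:
print(metrical_levels(180.0, 3, 2))
# {'bar_bpm': 60.0, 'beat_bpm': 180.0, 'tatum_bpm': 360.0}

Note that the tatum-level tempo (360 BPM) is an integer multiple of both the
beat-level (180 BPM) and bar-level (60 BPM) tempi, as stated above.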
Current tempo detection algorithms are mostly based on periodicity detection:
autocorrelation, resonant filter banks, or onset time statistics (cf. Sect. 11.2) are
some examples, as summarised in [51]. Very few approaches, however, aim at a
synergistic, combined assessment of tempo together with related information such as
metre or beat tracking to provide a robust basis for higher-level tasks such as
ballroom dance style or genre recognition. Further, a few studies introduce data-
driven genre and metre recognition [67, 68]. Others [69-71] use rhythmic feature
information for specialised tasks such as audio identification.
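As an illustration of the periodicity detection principle, the following minimal
Python sketch (the function name, tempo range, and envelope representation are
assumptions, not taken from [51]) estimates the dominant tempo of an onset strength
envelope via autocorrelation:

import numpy as np

def tempo_by_autocorrelation(onset_env, frame_rate,
                             bpm_min=60.0, bpm_max=240.0):
    # Autocorrelate the mean-removed onset strength envelope and pick the
    # lag with maximum correlation inside a plausible tempo range.
    env = np.asarray(onset_env, dtype=float)
    env = env - env.mean()
    acf = np.correlate(env, env, mode='full')[len(env) - 1:]  # lags >= 0
    lag_min = int(round(frame_rate * 60.0 / bpm_max))  # fast tempo, short lag
    lag_max = int(round(frame_rate * 60.0 / bpm_min))  # slow tempo, long lag
    lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return 60.0 * frame_rate / lag  # convert the best lag back to BPM

Note that a sub- or super-multiple of the true beat period may correlate almost as
strongly as the beat period itself, which is precisely the source of the 'octave'
errors addressed below.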
In this section, an approach for robust data-driven rhythm analysis is discussed.
To this end, LLDs modelling rhythmic information are presented that are tailored to
classify duple and triple metre as well as ballroom dance styles. Once these are
determined, the information is used to reliably assess the quarter-note tempo and to
avoid 'octave' errors, i.e., mistakenly doubling, tripling, or halving the tempo.
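A minimal sketch of such metre-informed 'octave' correction (the target tempo
range, candidate factors, and function name are illustrative assumptions, not the
approach evaluated in this section):

def fold_to_quarter_note(raw_bpm, metre=2, bpm_min=70.0, bpm_max=180.0):
    # Scale the raw estimate by powers of the metre-implied factor
    # (2 for duple, 3 for triple metre) and keep the candidate that lies
    # in, or failing that closest to, the plausible quarter-note range.
    candidates = [raw_bpm * metre ** k for k in range(-2, 3)]
    in_range = [c for c in candidates if bpm_min <= c <= bpm_max] or candidates
    centre = (bpm_min + bpm_max) / 2.0
    return min(in_range, key=lambda c: abs(c - centre))

print(fold_to_quarter_note(320.0, metre=2))  # halves a doubled estimate: 160.0
print(fold_to_quarter_note(50.0, metre=3))   # corrects a 'third' error: 150.0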
The determination of tempo, metre, and (on-)beat positions [25] can roughly be
divided into two major strategies:
The first strategy starts with the localisation of onsets in the audio (or in
symbolic notation such as MIDI), as was shown in the last section. The desired
determination tasks are then based on the analysis of the inter-onset intervals
(IOIs) [72-78]. To this end, histogram approaches are found most frequently
[13, 75]: the duration and weight of all possible IOIs are calculated, the IOIs are
binned by similarity clustering, and the clusters are arranged in a histogram. From
the weights and the centres of the clusters, the tempo of several metrical levels can
be estimated, as sketched below. Alternatively, rule-based approaches are employed
[13]. Or, exclusively the tatum pulse, i.e., the fastest pulse present in a piece, is
computed by choosing the cluster whose centre corresponds to the smallest IOI [75].
Then, features are extracted within a window around each tatum pulse, and the tatum
pulses are classified, e.g., by Bayesian methods, with respect to their perceived
accentuation. By that, the beat level is detected, based on the assumption that beats
are more accented than off-beat pulses.
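The histogram principle can be sketched as follows (in Python; the bin width, the
admissible IOI range, and the function name are illustrative assumptions rather than
the exact settings of [13, 75]):

import numpy as np

def ioi_tempo_candidates(onset_times, bin_width=0.025, n_candidates=3):
    # Collect all forward inter-onset intervals, not only those between
    # neighbouring onsets, so that several metrical levels appear as peaks.
    onsets = np.asarray(onset_times, dtype=float)
    diffs = onsets[None, :] - onsets[:, None]
    iois = diffs[np.triu_indices(len(onsets), k=1)]
    iois = iois[(iois >= 0.1) & (iois <= 2.0)]  # musically plausible range
    hist, edges = np.histogram(
        iois, bins=np.arange(0.1, 2.0 + bin_width, bin_width))
    centres = (edges[:-1] + edges[1:]) / 2.0
    heaviest = np.argsort(hist)[::-1][:n_candidates]  # heaviest clusters first
    return [60.0 / centres[i] for i in heaviest]      # IOI in s -> tempo in BPM

Each returned candidate typically corresponds to one metrical level, e.g., the beat,
bar, or tatum level.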
In the second strategy to determine tempo, metre, and (on-)beat positions, the
order is inverted, i.e., onset positions are retrieved after the analysis of tempo and
metrical structure. In this case, resonator methods or the related correlation
approaches are commonly used; a minimal sketch follows below. Onset localisation
then benefits from the knowledge gained
throughout tempo detection [5, 13, 14, 16, 19, 79]. This second strategy tends to
lead to more robust onset localisation, since the search for onsets is constrained
by the previously estimated tempo and metrical structure.
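The resonator principle can be sketched as follows (in Python; the feedback gain,
tempo grid, and function name are assumptions, loosely following the general comb
filter idea rather than a specific method from [5, 13, 14, 16, 19, 79]):

import numpy as np

def comb_filter_tempo(onset_env, frame_rate, bpm_grid=None):
    # Each candidate tempo defines a feedback comb filter whose delay equals
    # one beat period; the candidate whose filter accumulates the most output
    # energy, i.e., resonates most strongly with the envelope, is selected.
    if bpm_grid is None:
        bpm_grid = np.arange(60, 241, 2)
    alpha = 0.9  # feedback gain; controls resonance sharpness (illustrative)
    best_bpm, best_energy = None, -np.inf
    for bpm in bpm_grid:
        delay = max(1, int(round(frame_rate * 60.0 / bpm)))
        y = np.zeros(len(onset_env))
        for n in range(len(onset_env)):
            feedback = y[n - delay] if n >= delay else 0.0
            y[n] = (1.0 - alpha) * onset_env[n] + alpha * feedback
        energy = float(np.sum(y ** 2))
        if energy > best_energy:
            best_bpm, best_energy = bpm, energy
    return best_bpm

The output of the winning filter peaks near the beat positions, which illustrates
how the onset retrieval can profit from the preceding tempo analysis.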