Digital Signal Processing Reference
A binary class can be chosen by the signum of S, or S can serve as a feature for
classification or regression in combination with data-driven analysis.
6.4 Supra Segmental Features
After having discussed and introduced various types of acoustic and symbolic LLD,
in this section we will have a look at the principle of supra segmental analysis by
feature brute-forcing.
The basis is provided by statistical 'functionals', which are applied to an audio
chunk and map each LLD's time series of varying length to a single value per
functional. Examples of functionals are the mean, minimum, maximum, and standard
deviation, as well as the ones shown in Table 6.2. Such mappings are also referred
to as aggregate features or feature summaries. Further, delta coefficients, moving
averages, or various filter types are commonly applied to low-level descriptors.
Hierarchies of such post-processing steps have proven to lead to more robust
features; e.g., in [9], hierarchical functionals, i.e., 'functionals of functionals',
are used. This consequently leads to the novel principle of Analytic Feature (AF)
generation [93]: deriving a large number of contours from the LLD and then applying
functionals in a systematic manner, i.e., to each LLD, results in brute-forcing of
up to several thousand audio features.
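The mapping described above can be sketched in a few lines of Python. This is only a minimal illustration, not the openSMILE implementation: the functional set, the simple first-order difference used in place of regression-based delta coefficients, and all names (`FUNCTIONALS`, `delta`, `brute_force_features`, the toy LLD values) are assumptions made for this example.

```python
import statistics

# Hypothetical functional set for illustration; real systems use many more.
FUNCTIONALS = {
    "mean": statistics.fmean,
    "min": min,
    "max": max,
    "stddev": statistics.pstdev,
}


def delta(series):
    """First-order difference, a simple stand-in for delta coefficients."""
    return [b - a for a, b in zip(series, series[1:])]


def brute_force_features(llds):
    """Apply every functional to every LLD contour and to its delta contour.

    `llds` maps an LLD name to its time series (one value per frame).
    The result has one value per (contour, functional) pair, so its size
    does not depend on the number of frames in the chunk.
    """
    features = {}
    for lld_name, series in llds.items():
        contours = {lld_name: series, lld_name + "_delta": delta(series)}
        for contour_name, contour in contours.items():
            for func_name, func in FUNCTIONALS.items():
                features[contour_name + "_" + func_name] = func(contour)
    return features


# Two toy audio chunks with different numbers of frames (made-up values).
chunk_a = {
    "energy": [0.1, 0.4, 0.3, 0.8, 0.6],
    "f0": [120.0, 125.0, 130.0, 128.0, 126.0],
}
chunk_b = {
    "energy": [0.2, 0.5, 0.1],
    "f0": [200.0, 210.0, 205.0],
}

features_a = brute_force_features(chunk_a)
features_b = brute_force_features(chunk_b)

# 2 LLDs x 2 contours (raw, delta) x 4 functionals = 16 features per chunk.
print(len(features_a), len(features_b))
```

Both chunks yield feature vectors with the same 16 entries even though they differ in length, which is what makes the result usable as a fixed-size input to a classifier or regressor. Adding further derived contours or functionals multiplies the feature count, which is how the brute-forcing approach reaches thousands of features.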
The principle of feature brute-forcing together with LLD extraction will be
illustrated in the next section based on the open-source Speech and Music
Interpretation by Large-space Extraction (openSMILE; see footnote 6) toolkit, a
fast feature extractor and signal processing tool [94].
6.5 Audio Feature Extraction: The openSMILE Toolkit
openSMILE's aim is to unite the features typically used for the different types of
audio signals (speech, music, and sound) that have been introduced so far. This
should enable research in any one domain to benefit from features from the other
domains and facilitate general Intelligent Audio Analysis.
A strong focus is put on fully supporting real-time, incremental processing.
openSMILE provides a simple, scriptable console application where modular feature
extraction components can be freely configured and connected via configuration files.
Most of the individual feature extraction functions are usable as library functions and
can be integrated into existing applications. Both incremental on-line processing for
live applications and off-line batch processing are supported. Unit tests are provided
for developers to ensure exact numeric compatibility with future versions.
6 Available at: http://opensmile.sourceforge.net/.
 