Audio Data - Intelligent Audio Analysis

Chapter 5

Audio Data

It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.

— Sir Arthur Conan Doyle

5.1 Audio Data Requirements

Training and testing intelligent audio systems requires audio data. Data is often considered one of the main bottlenecks, and the common opinion is that there is “no data like more data”. However, sheer quantity is not the only prerequisite. Obtaining considerable amounts of data can be difficult and laborious [1], particularly because the data usually also needs to be labelled. Table 5.1 provides an overview of the most relevant requirements when building an (audio) database for training and testing classifiers and regressors.

To obtain annotations with labels $y_n$ for each instance $n$ of the Intelligent Audio Analysis task of interest at reduced cost, new methods for community or distributed annotation are of interest, such as crowdsourcing, e.g., via Amazon Mechanical Turk. If one further wants to reduce the amount of audio data prior to labelling to those instances that are likely to yield the best gain for the system, the field of active learning provides solutions [2]. In addition, larger amounts of data can be obtained without the annotation effort typically involved: uniting databases for training [3] and semi-supervised learning techniques have recently been shown to be beneficial [4, 5]. The latter in particular allows the exploitation of practically infinite amounts of data, such as audio and audiovisual streams available online. A more complex, yet also very promising alternative was shown in [6], where
