Digital Signal Processing Reference
Chapter 5
Audio Data
It is a capital mistake to theorize before one has data. Insensibly
one begins to twist facts to suit theories, instead of theories to
suit facts.
Sir Arthur Conan Doyle
5.1 Audio Data Requirements
Training and testing intelligent audio systems requires audio data. Data is often
considered one of the main bottlenecks, and the common opinion is that there
is “no data like more data”. However, sheer quantity is not the only prerequisite,
and obtaining considerable amounts of data can be difficult and laborious [1],
not least because the data usually also needs to be labelled. Table 5.1
provides an overview of the most relevant requirements when building an
(audio) database for learning and testing of classifiers and regressors.
To obtain annotations with labels y_n for each instance n of the Intelligent Audio
Analysis task of interest at reduced cost, new methods for community or distributed
annotation such as crowdsourcing, e.g., via Amazon Mechanical Turk¹, will be of
interest. If one further wants to reduce the audio data prior to labelling
to those instances that are likely to yield the best gain for the system, the field of
active learning provides solutions [2]. In addition, to obtain even larger
amounts of data without the annotation effort typically involved, uniting
databases for training [3] and semi-supervised learning techniques have recently been
shown to be beneficial [4, 5]. The latter in particular allow the exploitation of
practically infinite amounts of data, such as audio and audiovisual streams
available online. A more complex, yet also very promising alternative was shown in [6], where
¹ https://www.mturk.com/mturk/
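
The active learning idea mentioned above, selecting for labelling only those instances expected to give the best gain, can be illustrated with a minimal sketch. It assumes uncertainty sampling with an entropy criterion, which is one common selection rule and not necessarily the method of [2]. The function names, the posterior values, and the labelling budget are hypothetical and chosen purely for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_labelling(posteriors, budget):
    """Return indices of the `budget` most uncertain unlabelled instances.

    posteriors: one class-probability vector per unlabelled instance n,
                as estimated by the current classifier (assumed given).
    budget:     number of instances that can be sent to human annotators.
    """
    # Rank instances from highest to lowest predictive entropy.
    ranked = sorted(
        range(len(posteriors)),
        key=lambda n: entropy(posteriors[n]),
        reverse=True,
    )
    return ranked[:budget]


# Made-up posteriors for four unlabelled audio instances (two classes).
posteriors = [
    [0.98, 0.02],  # classifier is confident
    [0.55, 0.45],  # classifier is uncertain
    [0.80, 0.20],
    [0.50, 0.50],  # classifier is maximally uncertain
]

queries = select_for_labelling(posteriors, budget=2)
print(queries)  # indices of the instances to annotate first
```

Only the selected instances would then be passed to annotators, so the labelling effort is spent where the current model is least certain.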
 