Applications in Intelligent Sound Analysis - Intelligent Audio Analysis


As the classifier, Random Forests, i.e., ensembles of decision trees, are used. This choice is motivated by their ability to cope with large feature spaces, as feature sub-spaces are randomly assigned to the trees in the forest. A configuration of 30 trees with 150 randomly assigned features per tree proved to work well. To further support reproducibility, besides using an open-source feature extractor and the FindSounds database (cf. Sect. 5.3.3), which can be retrieved from the Internet, the classifier implementation provided by the Weka toolkit [22] is again chosen.
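The random assignment of feature sub-spaces described above can be illustrated with a minimal sketch. It is not Weka's implementation (Weka may draw the random features differently, e.g., at each node of a tree); it only follows the description in the text. The function names and the total feature count of 1000 are hypothetical placeholders, since the size of the feature space is not stated here.

```python
import random
from collections import Counter


def assign_feature_subspaces(n_features, n_trees=30, features_per_tree=150, seed=1):
    """Draw one random feature subset (without replacement) for each tree."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_features), features_per_tree))
            for _ in range(n_trees)]


def majority_vote(tree_predictions):
    """Combine the class labels predicted by the individual trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


# n_features=1000 is an assumed value for illustration only.
subspaces = assign_feature_subspaces(n_features=1000)
print(len(subspaces), "trees,", len(subspaces[0]), "features each")
```

Each tree would then be trained only on the columns listed in its subset, and the forest's decision for an instance is the majority vote over the trees' predictions.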

12.2.2 Performance

Considering the imbalance of instances among the classes, UA is the measure of primary interest. In addition, WA is partly provided, as well as recall, precision, and F1-measure. The experiments are based on a random partitioning of the FindSounds database into three stratified folds, providing two training sets and one completely disjoint test set. The first fold (F1, 5 646 instances) is always used for training with its original, manually assigned labels. The second fold (5 646 instances) is used for training either without its original labels (F2_U) or with them (F2), so that semi-supervised and supervised use of this fold can be compared. The third and last fold (5 645 instances) is always used for testing. Random partitioning is carried out with Weka's default random seed.
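A stratified three-fold partitioning of this kind can be sketched as follows. This is an illustrative round-robin scheme under stated assumptions, not Weka's actual procedure; the function name, the seed value, and the toy labels are placeholders and do not come from the FindSounds data.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=3, seed=1):
    """Return n_folds lists of instance indices with similar class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(n_folds)]
    position = 0  # dealing continues across class boundaries to balance fold sizes
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for index in indices:
            folds[position % n_folds].append(index)
            position += 1
    return folds


# Toy labels for illustration only.
labels = ["people"] * 10 + ["animals"] * 7
f1, f2, test_fold = stratified_folds(labels)

# F1 always keeps its labels; F2 is used with labels (supervised)
# or with its labels withheld (semi-supervised, F2_U).
f2_unlabeled = [(index, None) for index in f2]
print(len(f1), len(f2), len(test_fold))
```

Because instances are dealt out one at a time, fold sizes differ by at most one. For 16 937 instances this scheme gives folds of 5 646, 5 646, and 5 645 instances, which agrees with the sizes reported above.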

Table 12.5 shows the confusions among the seven sound event categories when training on folds one and two with their original labels and testing on the third fold. This is the 'best case', as learning is entirely supervised and uses the maximum amount of labeled data, and it serves as the upper benchmark. Most confusions are intuitively plausible, such as those between sounds from people and sounds of animals, or between sounds from vehicles and sounds of noise makers.

Table 12.5 'Best case' confusions when automatically classifying seven sound categories on the FindSounds database with original labels for both training folds F1 and F2 (cf. line 'supervised F1 + F2' in Table 12.6)

             Classified as
Truth [#]    People  Animals  Nature  Vehicles  Noisemakers  Office  Instruments
People          564      153      11        26           17      25           50
Animals         126      717       7        35           23      20           18
Nature           18       35     157        42           44      10            6
Vehicles         37       37      26       476           86      15           45
Noisemakers      22       43      36        77          372      72           48
Office           29       37       1        16          111     364           31
Instruments      32       33       6        31           47      16        1 395
Confusions      264      338      87       227          328     158          198
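The measures named above can be recomputed from the confusion matrix of Table 12.5. The sketch below assumes the usual definitions: WA as the overall fraction of correctly classified instances, and UA as the unweighted mean of the class-wise recall values. Rows are taken as the true class and columns as the assigned class. The Vehicles row is written with seven values; its Animals entry (37) is an inferred value, chosen so that the column totals match the 'Confusions' row and the matrix sums to the 5 645 test instances.

```python
CLASSES = ["People", "Animals", "Nature", "Vehicles",
           "Noisemakers", "Office", "Instruments"]

# Rows: true class, columns: class assigned by the classifier (Table 12.5).
CONFUSION = [
    [564, 153, 11, 26, 17, 25, 50],
    [126, 717, 7, 35, 23, 20, 18],
    [18, 35, 157, 42, 44, 10, 6],
    [37, 37, 26, 476, 86, 15, 45],  # second value (37) is inferred, see text
    [22, 43, 36, 77, 372, 72, 48],
    [29, 37, 1, 16, 111, 364, 31],
    [32, 33, 6, 31, 47, 16, 1395],
]


def evaluate(matrix):
    """Compute WA, UA and per-class recall, precision and F1 from a square matrix.

    Assumes every class occurs at least once as truth and as prediction.
    """
    n = len(matrix)
    total = sum(map(sum, matrix))
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    recall = [matrix[i][i] / row_sums[i] for i in range(n)]
    precision = [matrix[i][i] / col_sums[i] for i in range(n)]
    f1 = [2 * p * r / (p + r) for p, r in zip(precision, recall)]
    wa = sum(matrix[i][i] for i in range(n)) / total
    ua = sum(recall) / n
    return {"WA": wa, "UA": ua, "recall": recall, "precision": precision, "F1": f1}


scores = evaluate(CONFUSION)
print(f"WA = {scores['WA']:.1%}, UA = {scores['UA']:.1%}")
for name, r, p, f in zip(CLASSES, scores["recall"], scores["precision"], scores["F1"]):
    print(f"{name:12s} recall {r:.3f}  precision {p:.3f}  F1 {f:.3f}")
```

For the matrix as transcribed here, WA comes out at about 71.7 % and UA at about 66.5 %. UA is the lower of the two because the large Instruments class, which is recognized best, dominates WA but counts only as one of seven classes in UA.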
