Applications in Intelligent Speech Analysis - Intelligent Audio Analysis

Table 10.23 Age and gender baseline results obtained by SMO learnt pairwise SVM with linear kernel

Sub-Ch.  Task             [ ]    [ ]

Train versus develop
Age      { ,..., }        44.24  44.40
         { ,..., }→{ }    47.11  46.17
         { }              46.22  45.85
Gender   { ,..., }→{ }    77.28  84.60
         { }              76.99  86.76

Train + develop versus test
Age      { ,..., }        44.94  45.60
         { ,..., }→{ }    48.83  46.71
         { }              48.91  46.24
Gender   { ,..., }→{ }    81.21  84.81
         { }              80.42  86.26

Table 10.24 Selected speaker-independent results for height (H) recognition on the TIMIT corpus test partition; contextual information by feature inclusion of age (A), gender (G), American English dialect (D), education level (E), race (R) or all of these (All)

Context  CC     MLE [cm]
-        0.296  7.05
R        0.286  7.09
G        0.299  7.01
A        0.314  6.94
A,G      0.317  6.91
A,R      0.302  7.00
G,R      0.290  7.05
A,G,R    0.304  6.98
All      0.306  7.07

CC and MLE for regression (speaker height in centimetres). 1 582 acoustic features, regression by SVR with linear kernel, SMO, complexity 0.05
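The two measures in the table note can be sketched as follows. This is a minimal illustration only: it assumes that CC denotes the Pearson correlation coefficient and that MLE denotes the mean linear error, i.e. the mean absolute deviation between reference and predicted height. The function names and the example heights are hypothetical and are not taken from the TIMIT experiments.

```python
import math


def correlation_coefficient(y_true, y_pred):
    """Pearson correlation between reference and predicted values (assumed meaning of CC)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def mean_linear_error(y_true, y_pred):
    """Mean absolute deviation in the unit of the target (assumed meaning of MLE)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical speaker heights in centimetres, for illustration only.
reference = [160.0, 170.0, 180.0, 190.0]
predicted = [165.0, 168.0, 178.0, 185.0]

cc = correlation_coefficient(reference, predicted)
mle = mean_linear_error(reference, predicted)
print(f"CC = {cc:.3f}, MLE = {mle:.2f} cm")
```

A higher CC and a lower MLE both indicate better height estimates, which is how the rows of Table 10.24 are to be compared.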

Table 10.24 shows results for the speaker height assessment task under strict speaker independence, using TIMIT's training and test partitions as stated above. Speaker context is added exclusively as meta-information: selected supplementary traits, alone or in combination, are appended as additional feature(s) to the acoustic vector. Since the task is formulated as regression on a continuous ordinal target, CC and MLE serve as the measures of performance. Gains can be observed when ground-truth information on supplementary speaker traits is gradually added alongside the target task, but they are small: for height recognition, gender inclusion yields a 1.2 % relative improvement in correlation, age inclusion 6.2 %, and combined age and gender inclusion 7.3 %, with only the last being significant. Interestingly, age inclusion helps more than gender inclusion for the assessment of height, even though all speakers can be assumed to have reached their maximal height, given that their ages are above maturity.
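The relative correlation improvements can be recomputed from the CC column of Table 10.24 as (CC with context - CC without context) / CC without context. The sketch below does this for the three contexts named in the text; the dictionary and function names are hypothetical. Because the table reports CC rounded to three decimals, the recomputed values come out slightly lower than the percentages quoted in the text, which were presumably derived from unrounded correlations (an assumption, since the source does not say).

```python
# CC values copied from Table 10.24 (rounded to three decimals in the table).
cc_by_context = {"-": 0.296, "G": 0.299, "A": 0.314, "A,G": 0.317}


def relative_improvement(baseline, value):
    """Relative improvement over the baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


baseline_cc = cc_by_context["-"]  # acoustic features only, no context
gains = {
    context: relative_improvement(baseline_cc, cc_by_context[context])
    for context in ("G", "A", "A,G")
}

for context, gain in gains.items():
    print(f"{context}: {gain:.1f} % relative CC improvement")
```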
