Digital Signal Processing Reference
Table 11.27 Number of beats per trait, class and partition in the UltraStar singer trait database

# Beats            Train      Devel       Test        Sum
No voice (0)      90 076     75 741     48 948    214 765
Gender
  Female (f)      32 308     23 071      9 739     65 118
  Male (m)        55 505     49 497     37 686    142 688
  ?                   86        253        771      1 110
Race
  White (w)       67 525     62 003     40 479    170 007
  b/h/a           16 378      9 465      7 136     32 979
  ?                3 996      1 353        581      5 930
Age
  Young (y)       48 510     42 056     25 682    116 248
  Old (o)         34 074     24 596     18 712     77 382
  ?                5 315      6 169      3 802     15 286
Height
  Small (s)       29 638     24 946      8 562     63 146
  Tall (t)        30 177     30 146     23 452     83 775
  ?               28 084     17 729     16 182     61 995
Sum              177 975    148 562     97 144    423 681
'b/h/a': black / hispanic / asian. 'Unknown' (?) includes simultaneous performance of artists of
different gender/race, and those with unknown ground truth
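
The 'Sum' row of Table 11.27 is not the total of all rows: each beat is labelled once per trait, so the row appears to equal the 'No voice' count plus the class counts of any single trait. The following minimal Python sketch checks this reading against the numbers in the table. The names NO_VOICE, TRAITS, TABLE_SUM and partition_totals are illustrative only and do not come from the source.

```python
# Beat counts per partition (train, devel, test), copied from Table 11.27.
NO_VOICE = (90076, 75741, 48948)
TRAITS = {
    "gender": {"f": (32308, 23071, 9739), "m": (55505, 49497, 37686), "?": (86, 253, 771)},
    "race": {"w": (67525, 62003, 40479), "b/h/a": (16378, 9465, 7136), "?": (3996, 1353, 581)},
    "age": {"y": (48510, 42056, 25682), "o": (34074, 24596, 18712), "?": (5315, 6169, 3802)},
    "height": {"s": (29638, 24946, 8562), "t": (30177, 30146, 23452), "?": (28084, 17729, 16182)},
}
TABLE_SUM = (177975, 148562, 97144)  # the 'Sum' row of the table


def partition_totals(classes):
    """Add the no-voice beats to the class counts of one trait, per partition."""
    per_partition = zip(*classes.values())  # regroup class tuples by partition
    return tuple(n + sum(col) for n, col in zip(NO_VOICE, per_partition))


totals = {trait: partition_totals(classes) for trait, classes in TRAITS.items()}
for trait, total in totals.items():
    print(trait, total, total == TABLE_SUM)
```

Under this reading, every trait yields the same per-partition totals, which is what one would expect if all voiced beats carry exactly one label (possibly '?') for each trait.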
separation approach as described in [170, 171] is additionally applied: Starting from the STFT of the audio signal at frame $n$, denoted $[S]_{:,n}$, the spectrum is expressed as the sum of two independent components as $[S]_{:,n} = [V]_{:,n} + [M]_{:,n}$, where $[V]_{:,n}$ is the STFT of the leading voice, and $[M]_{:,n}$ is that of the background musical signal parts. $[V]_{:,n}$ and $[M]_{:,n}$ are assumed to be centred proper complex Gaussian variables (see footnote 14):
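$$[V]_{:,n} \sim \mathcal{N}_c\left(0, \operatorname{diag}\left([\sigma_V^2]_{:,n}\right)\right), \tag{11.37}$$

$$[M]_{:,n} \sim \mathcal{N}_c\left(0, \operatorname{diag}\left([\sigma_M^2]_{:,n}\right)\right), \tag{11.38}$$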
where $[\sigma_V^2]_{:,n}$ or respectively $[\sigma_M^2]_{:,n}$ is the power spectral density (PSD) of the leading voice or respectively of the background music at frame $n$. Assuming independence between the two components, the STFT of the observed signal then is also a proper Gaussian vector:

$$[S]_{:,n} \sim \mathcal{N}_c\left(0, \operatorname{diag}\left([\sigma_V^2]_{:,n} + [\sigma_M^2]_{:,n}\right)\right). \tag{11.39}$$
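
The additivity of the two PSDs in Eq. (11.39) can be checked numerically. The sketch below is only an illustration under stated assumptions, not the separation method of [170, 171]: the PSD values in sigma2_v and sigma2_m are invented, the helper names are hypothetical, and the same PSD is reused for many draws so that an average can be formed, whereas in the text the PSDs vary with the frame $n$. Each bin is drawn with independent real and imaginary parts of equal variance, as in footnote 14.

```python
import random


def sample_proper_complex_gaussian(psd, rng):
    """Draw one frame; bin k is zero-mean with E[|x_k|^2] = psd[k]."""
    frame = []
    for variance in psd:
        # Real and imaginary parts are independent and share variance / 2,
        # so the expected squared magnitude of the bin equals its PSD value.
        std = (variance / 2.0) ** 0.5
        frame.append(complex(rng.gauss(0.0, std), rng.gauss(0.0, std)))
    return frame


def empirical_power(frames):
    """Average squared magnitude per frequency bin over all drawn frames."""
    n_bins = len(frames[0])
    return [sum(abs(f[k]) ** 2 for f in frames) / len(frames) for k in range(n_bins)]


rng = random.Random(0)            # fixed seed for reproducibility
sigma2_v = [4.0, 1.0, 0.25]       # hypothetical voice PSD (3 frequency bins)
sigma2_m = [0.5, 2.0, 1.0]        # hypothetical music PSD (3 frequency bins)
n_draws = 20000

mixture_frames = []
for _ in range(n_draws):
    v = sample_proper_complex_gaussian(sigma2_v, rng)  # leading voice
    m = sample_proper_complex_gaussian(sigma2_m, rng)  # background music
    mixture_frames.append([a + b for a, b in zip(v, m)])  # S = V + M

power_s = empirical_power(mixture_frames)
expected = [a + b for a, b in zip(sigma2_v, sigma2_m)]
print("empirical:", power_s)
print("expected :", expected)
```

With enough draws, the empirical power of the mixture in each bin should lie close to the sum of the two assumed PSD values, in line with the diagonal covariance of Eq. (11.39).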
Then, $[\sigma_V^2]_{:,n}$ and $[\sigma_M^2]_{:,n}$ are estimated per signal frame $n$. In the present use-case, the approach is completely unsupervised, i.e., no learning takes place. Rather, it relies on
14 A complex random variable whose real and imaginary parts are independent and follow a real Gaussian distribution with zero mean and identical variance (or identical covariance matrix in the case of a multivariate distribution).
 