Digital Signal Processing Reference
8.3.1 Voicing Determination
The voicing classification of speech, discussed in Chapter 6, can be
performed in many ways; here we briefly summarize two common
techniques.
Multi-Band Approach
Harmonic voicing is estimated by computing the normalized mean squared
error of a synthetic voiced spectrum, S_w(ω, ω_0), with respect to the speech
spectrum, S_w(ω), and comparing it against a threshold function for each
harmonic band [6]. The normalized mean squared error, D_k, of the kth
harmonic band is given by
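$$
D_k = \frac{\int_{(k-0.5)\omega_0}^{(k+0.5)\omega_0} \left| S_w(\omega) - S_w(\omega, \omega_0) \right|^2 \, d\omega}{\int_{(k-0.5)\omega_0}^{(k+0.5)\omega_0} \left| S_w(\omega) \right|^2 \, d\omega} \quad \text{for } k = 1, 2, \ldots, K \qquad (8.1)
$$
where K = π/ω_0 and ω_0 is the normalized fundamental frequency.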
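As a rough illustration of Equation (8.1), the integrals can be approximated by sums over uniformly spaced spectral samples, since the frequency step cancels in the ratio. The sketch below is not taken from the text: the function name `band_errors`, the sampling grid ω_i = iπ/(N − 1), the rounding of K down to an integer, and the value returned for an empty or silent band are all assumptions made for the example.

```python
import math


def band_errors(speech, synth, w0):
    """Approximate D_k of Eq. (8.1) for k = 1..K on a uniform grid.

    speech, synth: samples of S_w(w) and S_w(w, w0), real or complex,
        taken at w_i = i * pi / (N - 1) for i = 0..N-1 (assumed grid).
    w0: normalized fundamental frequency in radians per sample.
    """
    n = len(speech)
    if n != len(synth) or n < 2:
        raise ValueError("spectra must have equal length of at least 2")
    if w0 <= 0:
        raise ValueError("w0 must be positive")

    step = math.pi / (n - 1)       # spacing of the frequency grid
    num_bands = int(math.pi / w0)  # K = pi / w0, rounded down (assumption)

    errors = []
    for k in range(1, num_bands + 1):
        lower = (k - 0.5) * w0
        upper = min((k + 0.5) * w0, math.pi)
        # Indices of the samples with lower <= w_i < upper.
        first = math.ceil(lower / step)
        last = min(math.ceil(upper / step), n)
        num = sum(abs(speech[i] - synth[i]) ** 2 for i in range(first, last))
        den = sum(abs(speech[i]) ** 2 for i in range(first, last))
        # Assumed convention: an empty or silent band gets the error 1.0.
        errors.append(num / den if den > 0 else 1.0)
    return errors
```

With this normalization a synthetic spectrum that matches the speech spectrum exactly gives D_k = 0 in every band, and an all-zero synthetic spectrum gives D_k = 1.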
Figure 8.3 illustrates the D_k values of two speech spectra with their
corresponding synthetic spectra. If D_k is below the threshold function, i.e. the
error is small and the spectral match is good, the kth band is declared voiced. The initial multi-
[Figure 8.3 Two speech spectra, panels (a) and (b), each plotted against frequency (0 to 4000 Hz): (a) original spectrum S_w(ω), (b) synthetic spectrum S_w(ω, ω_0), and (c) normalized D_k]
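The per-band voicing decision amounts to comparing each D_k with the threshold for that band. The sketch below is only illustrative: the function name `classify_bands` and the constant threshold of 0.2 are hypothetical placeholders, since the form of the threshold function is not given here.

```python
def classify_bands(errors, threshold):
    """Return one flag per band: True (voiced) when D_k < threshold(k).

    errors: the values D_1..D_K, in band order.
    threshold: callable mapping the band index k (1-based) to a threshold.
    """
    return [d < threshold(k) for k, d in enumerate(errors, start=1)]


# Hypothetical constant threshold; a real threshold function may vary with k.
decisions = classify_bands([0.05, 0.12, 0.60, 0.90], lambda k: 0.2)
```

Passing the threshold as a function of k keeps the sketch compatible with a band-dependent threshold.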