Digital Signal Processing Reference
MBE Mixed Voicing
The voicing decision is made by examining the normalized distance $D_k$ between the original and estimated speech spectra in frequency bands,

$$D_k = \frac{\sum_{m=a_k}^{b_k} \left( |S(m)| - S(m, \omega_0) \right)^2}{\sum_{m=a_k}^{b_k} |S(m)|^2} \qquad (6.59)$$
where $\omega_0$ is the refined fundamental frequency, $a_k$ and $b_k$ are the first and last harmonic in the $k$th band, $S(m)$ is the original speech spectrum, and $S(m, \omega_0)$ is the reconstructed speech spectrum, which is calculated using:
$$S(m, \omega_0) = A_l(\omega_0)\, W(m), \qquad 1 \le l \le L, \quad a_l \le m < b_l \qquad (6.60)$$
where $a_l = \lceil (l - 0.5)\,\omega_0 \rceil$, $b_l = \lceil (l + 0.5)\,\omega_0 \rceil$, $\lceil \cdot \rceil$ means the nearest integer greater than or equal to, $L$ is the number of harmonics within the 4 kHz speech bandwidth, $W(m)$ is the frequency response of a suitable window centred at the $l$th harmonic of the fundamental frequency (see Figure 6.31) and $A_l(\omega_0)$ is
Figure 6.31 Frequency response of the Hamming window (horizontal axis: frequency in kHz, from −4.0 to 4.0; vertical axis ticks from −60.0 to 40.0)
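The band limits of Equation (6.60) and the normalized distance of Equation (6.59) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the book's implementation: the function names harmonic_bin_edges and band_distance are hypothetical, the fundamental frequency is assumed to be expressed in DFT bin units, the summation limits of (6.59) are treated as bin indices, and the reconstructed spectrum is assumed to be already available as a list of non-negative magnitudes.

```python
import math


def harmonic_bin_edges(l, omega0_bins):
    """Return (a_l, b_l) for harmonic l, following Eq. (6.60).

    omega0_bins is the fundamental frequency in DFT bin units
    (an assumption of this sketch). The harmonic covers the
    bins a_l <= m < b_l.
    """
    a_l = math.ceil((l - 0.5) * omega0_bins)
    b_l = math.ceil((l + 0.5) * omega0_bins)
    return a_l, b_l


def band_distance(orig_mag, recon_mag, a_k, b_k):
    """Normalized distance D_k over bins a_k..b_k inclusive, Eq. (6.59).

    orig_mag  : magnitudes |S(m)| of the original spectrum
    recon_mag : reconstructed spectrum S(m, omega_0), assumed real
    """
    bins = range(a_k, b_k + 1)
    num = sum((orig_mag[m] - recon_mag[m]) ** 2 for m in bins)
    den = sum(orig_mag[m] ** 2 for m in bins)
    # Returning 0.0 for an empty band is an arbitrary convention
    # of this sketch, not something taken from the text.
    return num / den if den > 0 else 0.0


if __name__ == "__main__":
    print(harmonic_bin_edges(1, 10.5))
    print(band_distance([1.0, 2.0, 2.0, 1.0], [1.0, 1.0, 2.0, 1.0], 1, 2))
```

A small value of the distance means the harmonic model matches the original spectrum closely in that band, and a large value means it matches poorly.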