Digital Signal Processing Reference
normalized parameters are given by,
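\[
\mathit{Ps} =
\begin{cases}
\dfrac{\mathit{Ps} - \mathit{Th}_{ps}}{\mathit{Ps}_{\max} - \mathit{Th}_{ps}}, & \text{if } \mathit{Ps} > \mathit{Th}_{ps} \\[1.5ex]
\dfrac{\mathit{Ps} - \mathit{Th}_{ps}}{\mathit{Th}_{ps} - \mathit{Ps}_{\min}}, & \text{if } \mathit{Ps} < \mathit{Th}_{ps}
\end{cases}
\tag{6.51}
\]
\[
\mathit{Pk} =
\begin{cases}
\dfrac{\mathit{Pk} - \mathit{Th}_{pk}}{\mathit{Pk}_{\max} - \mathit{Th}_{pk}}, & \text{if } \mathit{Pk} > \mathit{Th}_{pk} \\[1.5ex]
\dfrac{\mathit{Pk} - \mathit{Th}_{pk}}{\mathit{Th}_{pk} - \mathit{Pk}_{\min}}, & \text{if } \mathit{Pk} < \mathit{Th}_{pk}
\end{cases}
\tag{6.52}
\]
\[
\mathit{Zc} =
\begin{cases}
\dfrac{\mathit{Th}_{zc} - \mathit{Zc}}{\mathit{Th}_{zc} - \mathit{Zc}_{\min}}, & \text{if } \mathit{Zc} < \mathit{Th}_{zc} \\[1.5ex]
\dfrac{\mathit{Th}_{zc} - \mathit{Zc}}{\mathit{Zc}_{\max} - \mathit{Th}_{zc}}, & \text{if } \mathit{Zc} > \mathit{Th}_{zc}
\end{cases}
\tag{6.53}
\]
\[
\mathit{St} =
\begin{cases}
\dfrac{\mathit{St} - \mathit{Th}_{st}}{\mathit{St}_{\max} - \mathit{Th}_{st}}, & \text{if } \mathit{St} > \mathit{Th}_{st} \\[1.5ex]
\dfrac{\mathit{St} - \mathit{Th}_{st}}{\mathit{Th}_{st} - \mathit{St}_{\min}}, & \text{if } \mathit{St} < \mathit{Th}_{st}
\end{cases}
\tag{6.54}
\]
\[
\mathit{LF} =
\begin{cases}
\dfrac{\mathit{LF} - \mathit{Th}_{lf}}{\mathit{LF}_{\max} - \mathit{Th}_{lf}}, & \text{if } \mathit{LF} > \mathit{Th}_{lf} \\[1.5ex]
\dfrac{\mathit{LF} - \mathit{Th}_{lf}}{\mathit{Th}_{lf} - \mathit{LF}_{\min}}, & \text{if } \mathit{LF} < \mathit{Th}_{lf}
\end{cases}
\tag{6.55}
\]
\[
\mathit{Pr} =
\begin{cases}
\dfrac{\mathit{Th}_{pr} - \mathit{Pr}}{\mathit{Th}_{pr} - \mathit{Pr}_{\min}}, & \text{if } \mathit{Pr} < \mathit{Th}_{pr} \\[1.5ex]
\dfrac{\mathit{Th}_{pr} - \mathit{Pr}}{\mathit{Pr}_{\max} - \mathit{Th}_{pr}}, & \text{if } \mathit{Pr} > \mathit{Th}_{pr}
\end{cases}
\tag{6.56}
\]
\[
\mathit{Fe} =
\begin{cases}
\dfrac{E_0 - \mathit{Th}_{v}}{E_{\max} - \mathit{Th}_{v}}, & \text{if voiced} \\[1.5ex]
\dfrac{E_0 - \mathit{Th}_{uv}}{\mathit{Th}_{uv} - E_{\min}}, & \text{if unvoiced} \\[1.5ex]
0, & \text{if not sure}
\end{cases}
\tag{6.57}
\]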
where $\mathit{Th}_{ps}$, $\mathit{Th}_{pk}$, $\mathit{Th}_{zc}$, $\mathit{Th}_{st}$, $\mathit{Th}_{lf}$ and $\mathit{Th}_{pr}$ are fixed voicing thresholds for the
pitch similarity, peakiness, zero crossing, spectral tilt, low-band to full-band
energy ratio, and pre-emphasized energy ratio respectively, and $\mathit{Th}_{v}$ and $\mathit{Th}_{uv}$
are adaptive voiced and unvoiced thresholds used to compare the frame
energy. The overall voicing indicator $V$ is then computed by combining the
contributions of all indicators.
\[
V = w_1 \mathit{Ps} + w_2 \mathit{Pk} + w_3 \mathit{Zc} + w_4 \mathit{St} + w_5 \mathit{LF} + w_6 \mathit{Pr} + w_7 \mathit{Fe}
\tag{6.58}
\]
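The normalization and weighted combination above can be sketched in a few lines of Python. This is only an illustrative sketch, not a reference implementation: the names `Range`, `normalize`, `normalize_energy` and `voicing_indicator` are hypothetical, the value returned when a parameter equals its threshold (a case the equations leave undefined) is assumed to be zero, and the frame energy is assumed to count as "voiced" when it exceeds $\mathit{Th}_{v}$ and "unvoiced" when it falls below $\mathit{Th}_{uv}$.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Range:
    """Threshold and extreme values of one parameter (hypothetical container)."""
    th: float  # fixed voicing threshold, e.g. Th_ps
    lo: float  # minimum value, e.g. Ps_min
    hi: float  # maximum value, e.g. Ps_max


def normalize(x: float, r: Range, invert: bool = False) -> float:
    """Piecewise normalization in the style of Eqs. (6.51)-(6.56).

    invert=True uses (Th - x) as the numerator, as in the Zc and Pr
    equations, where values below the threshold point towards voiced.
    """
    if x == r.th:
        return 0.0  # assumption: the equations do not define this case
    num = (r.th - x) if invert else (x - r.th)
    den = (r.hi - r.th) if x > r.th else (r.th - r.lo)
    return num / den


def normalize_energy(e0: float, e_max: float, e_min: float,
                     th_v: float, th_uv: float) -> float:
    """Frame energy term in the style of Eq. (6.57).

    Assumption: e0 > th_v means voiced, e0 < th_uv means unvoiced,
    anything in between is the "not sure" case.
    """
    if e0 > th_v:
        return (e0 - th_v) / (e_max - th_v)
    if e0 < th_uv:
        return (e0 - th_uv) / (th_uv - e_min)
    return 0.0


def voicing_indicator(params: Sequence[float],
                      weights: Sequence[float]) -> float:
    """Weighted sum of the normalized parameters, as in Eq. (6.58)."""
    if len(params) != len(weights):
        raise ValueError("one weight is needed per parameter")
    return sum(w * p for w, p in zip(weights, params))


# Usage with made-up numbers (not taken from the text):
ps = normalize(0.8, Range(th=0.5, lo=0.0, hi=1.0))
zc = normalize(0.2, Range(th=0.4, lo=0.0, hi=1.0), invert=True)
fe = normalize_energy(80.0, e_max=100.0, e_min=0.0, th_v=60.0, th_uv=40.0)
v = voicing_indicator([ps, zc, fe], [0.5, 0.25, 0.25])
```

With this normalization each parameter is positive on the voiced side of its threshold and negative on the unvoiced side, so the terms of the weighted sum all pull $V$ in a consistent direction.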
The weights $w_1, \ldots, w_7$ are chosen according to the reliability of each
indicator. The sign of $V$ indicates voiced when positive and unvoiced
when negative. If $V$ is close to zero it indicates an unsure case, and the
voicing of the previous frame could be used to increase reliability.
Furthermore, in cases where $V = \pm\delta$, where $\delta$ has a small value (indicating an unsure
case), individual voicing parameters can be checked to see if one or more of
them gives a clear indication of voiced or unvoiced. This can be achieved by
selecting two further thresholds for each parameter, one indicating voiced
and the other unvoiced. These thresholds must be selected by carrying out
long simulations. Typically $\mathit{Ps}$ can be above 0.7 for voiced and below 0.3 for
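
The decision rule built on the sign of $V$ can be sketched as follows. This is a hedged illustration only: the function name `decide_voicing`, the `checks` tuples and the simple vote count are assumptions, as is the order of the fallbacks (individual parameters first, previous frame second). The per-parameter check here assumes parameters for which larger values point towards voiced; all threshold values are left to the caller.

```python
from typing import Iterable, Tuple

# (value, voiced_above, unvoiced_below) for one parameter; hypothetical layout
Check = Tuple[float, float, float]


def decide_voicing(v: float, prev_voiced: bool, delta: float,
                   checks: Iterable[Check] = ()) -> bool:
    """Return True for voiced, False for unvoiced."""
    # Clear cases: the sign of V decides.
    if v > delta:
        return True
    if v < -delta:
        return False

    # Unsure case (|V| <= delta): look for parameters that give a
    # clear indication against their two extra thresholds.
    votes = 0
    for value, voiced_above, unvoiced_below in checks:
        if value > voiced_above:
            votes += 1
        elif value < unvoiced_below:
            votes -= 1
    if votes > 0:
        return True
    if votes < 0:
        return False

    # Still unsure: reuse the voicing of the previous frame.
    return prev_voiced
```

Keeping the previous-frame fallback as the last step means it only applies when neither $V$ nor any individual parameter gives a clear answer.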