Digital Signal Processing Reference
Spectrum Tilt
Voiced speech has higher energy at low frequencies, whereas unvoiced speech
usually has higher energy at high frequencies, so the two have opposite
spectral tilts. The spectral tilt can be represented by the first-order
normalized autocorrelation or the first reflection coefficient:
\[
S_t = \frac{\sum_{i=1}^{N} s(i)\, s(i-1)}{\sum_{i=1}^{N} s^2(i)} \tag{6.44}
\]
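As a rough illustration of Equation 6.44, the following Python sketch computes the spectral tilt of a single frame with NumPy. The function name spectral_tilt, the treatment of the sample preceding the frame as zero, and the value returned for an all-zero frame are assumptions made for this example and are not specified in the text.

```python
import numpy as np

def spectral_tilt(frame):
    """First-order normalized autocorrelation of one frame (Eq. 6.44)."""
    s = np.asarray(frame, dtype=float)
    energy = np.dot(s, s)  # denominator: sum of s^2(i)
    if energy == 0.0:
        return 0.0  # assumption: an all-zero frame is given a tilt of 0
    # Numerator: sum of s(i) * s(i-1). Assumption: the sample before the
    # first sample of the frame is taken as zero, so that term drops out.
    return float(np.dot(s[1:], s[:-1]) / energy)

# A slowly varying (low-frequency) frame gives a positive tilt,
# a sign-alternating (high-frequency) frame gives a negative tilt.
print(spectral_tilt([1, 1, 1, 1]))    # 0.75
print(spectral_tilt([1, -1, 1, -1]))  # -0.75
```

The result always lies between -1 and 1, which is what allows a single fixed threshold to be applied regardless of signal level.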
This is a very reliable parameter, especially for detecting plosives and for
rejecting isolated spikes in low-level signals. As can be seen from
Figure 6.25, it also indicates voiced and unvoiced sounds in general very
accurately.
Figure 6.25 Speech waveform and its spectral tilt with a possible voicing threshold
of 0.25 (shown by the dashed line)
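A frame-by-frame voicing decision based on the threshold of 0.25 could be sketched as follows. This is only a minimal example: the function name voicing_from_tilt, the frame length of 160 samples (20 ms at an assumed 8 kHz sampling rate), the use of non-overlapping frames, and the rule "voiced if the tilt exceeds the threshold" are assumptions, not details given in the text.

```python
import numpy as np

def voicing_from_tilt(signal, frame_len=160, threshold=0.25):
    """Return (tilt, voiced) per frame using the spectral tilt of Eq. 6.44.

    Assumptions: non-overlapping frames, trailing samples that do not fill
    a whole frame are dropped, and a frame is marked voiced when its tilt
    is greater than the threshold.
    """
    s = np.asarray(signal, dtype=float)
    n_frames = len(s) // frame_len
    frames = s[:n_frames * frame_len].reshape(n_frames, frame_len)
    # Numerator and denominator of Eq. 6.44, computed inside each frame.
    num = np.sum(frames[:, 1:] * frames[:, :-1], axis=1)
    den = np.sum(frames ** 2, axis=1)
    # All-zero frames keep a tilt of 0 instead of dividing by zero.
    tilt = np.divide(num, den, out=np.zeros(n_frames), where=den > 0)
    return tilt, tilt > threshold

# Synthetic check: a 100 Hz sine (low-frequency energy) followed by a
# sign-alternating sequence (high-frequency energy), two frames each.
fs = 8000
n = np.arange(320)
low = np.sin(2 * np.pi * 100 * n / fs)
high = np.cos(np.pi * n)
tilt, voiced = voicing_from_tilt(np.concatenate([low, high]))
print(voiced)  # [ True  True False False]
```

In practice the threshold would be tuned on real speech; 0.25 is described in the figure caption only as a possible value.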