Digital Signal Processing Reference
In-Depth Information
However, modern DSP techniques have made the computational complexity of
frequency-domain PDAs insignificant, which has made them very popular in
sinusoidal coders. In the following, we briefly explain two frequency-domain
PDAs.
Harmonic Peak Detection
An obvious way of determining the pitch in the frequency domain would be to
extract the spectral peak at the fundamental frequency. This requires the first
harmonic to be present, which cannot, in general, be expected because of the
front-end filtering. A more practical method is to detect all of the harmonic
peaks and then measure the fundamental frequency (pitch frequency) as
either the common divisor of these harmonics or the spacing of the adjacent
harmonics. This can be done using a comb filter given by
$$
C(\omega, \omega_0) =
\begin{cases}
W(k\omega_0), & \omega = k\omega_0,\quad k = 1, 2, \ldots, \omega_m/\omega_0 \\
0, & \text{otherwise}
\end{cases}
\tag{6.11}
$$
and correlating it with the speech spectrum. The output of the correlation,
$A_c(\omega_0)$, is the summation of weighted comb peaks as,
$$
A_c(\omega_0) = \frac{\omega_0}{\omega_m} \sum_{k=1}^{\omega_m/\omega_0} S(k\omega_0)\, W(k\omega_0),
\qquad \frac{2\pi}{\tau_{\max}} \le \omega_0 \le \frac{2\pi}{\tau_{\min}}
\tag{6.12}
$$
where $\omega_m$ is the maximum frequency considered in the speech spectrum.
If $\omega_0$ is equal to the fundamental frequency, the comb response will
match the harmonic peaks, and the maximum output will be obtained as shown in
Figure 6.4. In order to obtain better subjective quality, weighting
coefficients can be applied to the individual teeth, normally with weights
that decrease with increasing frequency [16].
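The comb search described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names `comb_correlation` and `estimate_pitch`, the Hann analysis window, the 4096-point FFT, the 60-400 Hz search range, the 3400 Hz upper frequency and the 1 Hz candidate step are all assumptions chosen for the example. The sum over the teeth is divided by the number of teeth so that low candidates with many teeth are not favoured, and the optional `weight` callable stands in for the tooth weighting $W(k\omega_0)$.

```python
import numpy as np

def comb_correlation(mag, fs, f0, f_max, weight=None):
    """Mean of the weighted spectral magnitudes sampled at harmonics of f0."""
    n_fft = 2 * (len(mag) - 1)            # mag is assumed to come from np.fft.rfft
    n_teeth = int(f_max // f0)            # harmonics of f0 that lie below f_max
    k = np.arange(1, n_teeth + 1)
    # Nearest FFT bin for each comb tooth k * f0
    bins = np.rint(k * f0 * n_fft / fs).astype(int)
    # Tooth weights: all ones unless a weighting function of frequency is given
    w = np.ones(n_teeth) if weight is None else weight(k * f0)
    return float(np.sum(mag[bins] * w) / n_teeth)

def estimate_pitch(frame, fs, f_lo=60.0, f_hi=400.0, f_max=3400.0, step=1.0):
    """Return the candidate f0 (Hz) that maximises the comb correlation."""
    n_fft = 4096                          # zero-padded FFT size (assumed)
    windowed = frame * np.hanning(len(frame))
    mag = np.abs(np.fft.rfft(windowed, n_fft))
    candidates = np.arange(f_lo, f_hi + step / 2, step)
    scores = [comb_correlation(mag, fs, f0, f_max) for f0 in candidates]
    return float(candidates[int(np.argmax(scores))])
```

Calling `estimate_pitch(frame, fs)` on a voiced frame sampled at `fs` Hz returns the best-scoring candidate in hertz. A weighting such as `weight=lambda f: 1.0 / (1.0 + f / 1000.0)` (a hypothetical choice) can be passed to `comb_correlation` to de-emphasise the higher teeth. Note that when the lower harmonics are weak or removed by filtering, a multiple of the true pitch can score higher, so practical systems usually add further checks.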
Spectrum Similarity
This method assumes that the spectrum is fully voiced and is composed only
of a number of harmonics each located at multiples of the pitch frequency. A
synthetic spectrum is reconstructed using this assumption for each possible
pitch frequency candidate and is compared to the original spectrum. The
pitch frequency leading to the best matching reconstructed spectrum is then
selected [13] as the fundamental or pitch frequency. The speech spectrum is
assumed to be composed of voiced harmonics only, located at multiples of
the candidate pitch frequency $\omega_0$. Therefore the synthetic spectrum
$S(m, \omega_0)$ is an approximation of the convolution of pulses located at
multiples of the candidate pitch frequency $\omega_0$, by the spectrum $W$ of
the window used on