Digital Signal Processing Reference
The whole search is therefore divided into two procedures: subrange search
and subrange comparison. Since these subranges are all fully overlapped,
the search over the subranges needs to be done only twice, once from left to
right and once from right to left. We start with the range $R_1$ and compare its
minimum with the nonoverlapped part of $R_2$, and so on until all of the
right-hand side is completed. The same procedure is applied to the left-hand
side, starting with $R_4$. Finally, the left-hand and right-hand side minima are
compared and the overall minimum is selected. The number of comparisons
during the search is thus independent of the size of the pitch search ranges
and is equal to three times the number of pitch candidates.
Multiple Pitch and Half Pitch Errors
Almost all PDAs have a peak detector which decides the pitch from the peak
position. In time-domain methods, for example, a peak appears not only at the
correct pitch lag but also at its integer multiples. It is therefore possible
that a multiple of the real pitch is chosen. Finding the desired peak among
these peaks normally requires a complicated procedure. The basic idea for
solving this problem involves two steps: picking the maximum peak, and then
checking the submultiple positions to see if there is a comparable peak.
However, since there is no fixed solution to this problem, tuned comparison
thresholds are generally used.
For example, in the case of the cross-correlation pitch estimation method,
the comparison is made by looking at the ratio $R(\tau_0/i)/R(\tau_0)$, where
$\tau_0$ is the lag of the maximum peak and $i$ is an integer for which the
submultiple $\tau_0/i$ is greater than or equal to the minimum expected pitch.
The smallest submultiple which produces a ratio greater than the set
threshold is selected as the pitch.
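The submultiple check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the reference procedure: the function name `pick_pitch_lag`, the single fixed `threshold` of 0.8, and the rounding of $\tau_0/i$ to the nearest integer lag are all hypothetical choices, and `R` is assumed to be a sequence of correlation values indexed by integer lag.

```python
def pick_pitch_lag(R, tau0, min_lag, threshold=0.8):
    """Return the smallest submultiple of tau0 whose ratio R(tau0/i)/R(tau0)
    exceeds the threshold, or tau0 itself if no submultiple qualifies.

    R         : correlation values indexed by integer lag (assumed input)
    tau0      : lag of the maximum peak (the initial pitch estimate)
    min_lag   : minimum expected pitch lag
    threshold : tuned comparison threshold (0.8 is an arbitrary example value)
    """
    if R[tau0] <= 0:
        return tau0  # the ratio is not meaningful without a positive peak

    # The largest divisor gives the smallest submultiple, so test it first.
    max_div = int(tau0 // min_lag)
    for i in range(max_div, 1, -1):
        lag = int(tau0 / i + 0.5)  # nearest integer lag (an assumption)
        if lag < min_lag:
            continue
        if R[lag] / R[tau0] > threshold:
            return lag
    return tau0
```

In practice the threshold is tuned, and may differ per submultiple; a single value is used here only to keep the sketch short.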
In frequency-domain methods, such as the spectrum similarity method,
a similar procedure can be applied. In this case, the average of the
harmonic magnitudes in the signal may be used in the comparison. At every
submultiple, the average of the harmonic magnitudes is computed by
$$A_v(\omega_k) = \frac{1}{L_k} \sum_{i=1}^{L_k} A(i\omega_k); \qquad k = 1, 2, 3, \ldots, n. \tag{6.41}$$
where $L_k$ is the total number of harmonics in a 4 kHz speech bandwidth,
$A(i\omega_k)$ are harmonic magnitudes and $\omega_k = 2\pi/(\tau_0/k)$ is the
fundamental frequency of the $k$th submultiple of the initial pitch. The ratio
between the $A_v(\omega_k)$ of the smallest submultiple and that of the initial
pitch, $\tau_0$, is then computed and compared with a threshold which may vary
for each submultiple. If this ratio is bigger than the corresponding
threshold, then the smallest submultiple is selected as the pitch estimate.
Otherwise the next largest submultiple is checked against
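As a rough illustration of equation (6.41), the average harmonic magnitude can be computed from an FFT magnitude spectrum as sketched below. Several details are assumptions rather than part of the text: the function name `average_harmonic_magnitude` is hypothetical, $\tau_0$ is taken to be in samples so that $\omega_k$ is in radians per sample, the 4 kHz band edge is assumed to coincide with the Nyquist frequency (8 kHz sampling), $L_k$ is taken as the integer part of $\tau_0/(2k)$, and each harmonic is read from the nearest FFT bin.

```python
import numpy as np

def average_harmonic_magnitude(mag, n_fft, tau0, k):
    """Average of the spectrum magnitude at the harmonics of
    omega_k = 2*pi / (tau0 / k), following equation (6.41).

    mag   : magnitude spectrum, e.g. np.abs(np.fft.rfft(x, n_fft))
    n_fft : FFT length used to compute mag
    tau0  : initial pitch lag in samples
    k     : submultiple index (k = 1 is the initial pitch itself)
    """
    omega_k = 2.0 * np.pi * k / tau0      # fundamental in rad/sample
    L_k = int(tau0 / (2.0 * k))           # harmonics up to Nyquist (assumption)
    if L_k < 1:
        raise ValueError("no harmonics fall inside the band")
    harmonics = np.arange(1, L_k + 1) * omega_k
    # Read each harmonic from the nearest FFT bin (a simplifying assumption).
    bins = np.rint(harmonics * n_fft / (2.0 * np.pi)).astype(int)
    bins = np.minimum(bins, len(mag) - 1)
    return float(np.mean(mag[bins]))
```

A caller would evaluate this for a submultiple and for the initial pitch, then form the ratio of the two values and compare it with the threshold tuned for that submultiple.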