Digital Signal Processing Reference
The range of the pitch search was limited to between 15 and 150 samples.
Spectral analysis was conducted using a 240-sample Hamming window and a
256-point FFT with 16-sample zero padding. When computing the TA, a 240-
sample rectangular or Hamming window was applied to the input signals.
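As a rough illustration of the spectral analysis settings above, the following sketch applies a 240-sample Hamming window, appends 16 zeros and takes a 256-point FFT. The function names, the placement of the zeros after the frame, and the small recursive radix-2 FFT are assumptions made for illustration; they are not taken from the original text.

```python
import cmath
import math

FRAME_LEN = 240   # analysis window length in samples (from the text)
FFT_LEN = 256     # FFT size in points (from the text)


def hamming(n):
    """Return an n-point symmetric Hamming window."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out


def frame_magnitude_spectrum(frame):
    """Window one 240-sample frame, zero-pad to 256 and return |X[k]| for k = 0..128."""
    if len(frame) != FRAME_LEN:
        raise ValueError("expected a 240-sample frame")
    window = hamming(FRAME_LEN)
    windowed = [s * w for s, w in zip(frame, window)]
    # Assumption: the 16 padding zeros are appended after the windowed frame.
    padded = windowed + [0.0] * (FFT_LEN - FRAME_LEN)
    return [abs(c) for c in fft(padded)[: FFT_LEN // 2 + 1]]
```

Because the evaluation equates 1 ms with 8 samples, the sampling rate is 8 kHz, so the 129 returned bins are spaced 31.25 Hz apart.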
Pitch errors were assessed frame by frame by comparing the detected pitch
period with the reference. A frame was classified as erroneous if the
absolute difference between the reference and the detected pitch periods
exceeded 1 ms (8 samples), as in [1]. Extra processing, such as pitch tracking
based on the pitch history of past frames, was not incorporated, so that
only the main algorithmic contributions were evaluated. Although unvoiced
speech regions were not taken into account, transitions were included in the
performance evaluations because these regions are perceptually very important.
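The error criterion described above can be sketched as follows. This is a minimal illustration, assuming that pitch periods are given in samples, that the caller passes only the frames included in the evaluation (voiced and transition frames), and that the error rate is reported as a percentage; the function name and signature are hypothetical.

```python
TOLERANCE_SAMPLES = 8  # 1 ms expressed in samples, as stated in the text


def pitch_error_rate(reference, detected, tolerance=TOLERANCE_SAMPLES):
    """Return the percentage of frames whose pitch period error exceeds the tolerance.

    reference, detected: per-frame pitch periods in samples, for evaluated frames only.
    """
    if len(reference) != len(detected):
        raise ValueError("reference and detected must have the same number of frames")
    if not reference:
        raise ValueError("at least one frame is required")
    # A frame is erroneous only if the absolute difference is strictly greater
    # than the tolerance ("more than 1 ms").
    errors = sum(1 for r, d in zip(reference, detected) if abs(r - d) > tolerance)
    return 100.0 * errors / len(reference)
```

For example, with reference periods `[40, 40, 80, 100]` and detected periods `[40, 49, 88, 50]`, the second and fourth frames are erroneous (differences of 9 and 50 samples), while the third (a difference of exactly 8 samples) is not.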
Analysis of the STA Weighting Factor
The effect of the STA rate α in terms of E_p is shown in Figure 6.10. The
results show that the STA gives improved performance compared with
the TA and the SA, corresponding to α = 1 and α = 0, respectively. The
lowest E_p was obtained when α = 0.5 for both the female and male speech
samples.
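The role of α can be illustrated with a small sketch. It assumes the STA score is a linear blend of per-lag TA and SA scores, which is consistent with the endpoints stated above (α = 1 gives the TA, α = 0 gives the SA) but is not spelled out in this section. The function names and the score lists are hypothetical; the minimum lag of 15 samples comes from the pitch search range used in the evaluation.

```python
MIN_LAG = 15  # lower bound of the pitch search range, in samples


def blend_scores(ta, sa, alpha):
    """Blend per-lag TA and SA scores; alpha = 1 returns TA, alpha = 0 returns SA."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(ta) != len(sa):
        raise ValueError("ta and sa must cover the same lags")
    # Assumed linear weighting between the two score curves.
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(ta, sa)]


def best_lag(scores, min_lag=MIN_LAG):
    """Return the candidate lag (in samples) with the highest blended score."""
    best_index = max(range(len(scores)), key=scores.__getitem__)
    return min_lag + best_index
```

Sweeping `alpha` over 0, 0.1, ..., 1 and measuring the resulting pitch error rate at each value would reproduce the kind of analysis summarised in Figure 6.10.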
Analysis of the SS-SA Weighting Factor
The weighting factor β of the SS-SA in equation (6.27) was analysed by varying
β between 0 and 1 (see Figure 6.11). As with the STA, the SS-SA shows a much
lower E_p than the SS and the SA, corresponding to β = 1 and β = 0,
respectively. Additionally, the lowest E_p values were
[Figure 6.10 plot. Horizontal axis: STA weighting rate (α), 0 to 1. Vertical axis: ticks from 0 to 30. Curves: Male / Rectangular, Male / Tapered, Female / Rectangular, Female / Tapered.]
Figure 6.10 Analysis of the effect of the STA weighting factor α in terms of the pitch
error rate; the formant weighting factor γ is 0.9