Digital Signal Processing Reference
Fig. 2.7 Pitch synchronous analysis. (a), (c) and (e): three consecutive pitch cycles of the speech signal (horizontal axis: Number of Samples). (b), (d) and (f): corresponding spectra (horizontal axis: Frequency (Hz)).
independently processed for extracting the spectral features. The main motivations
for pitch synchronous analysis of speech signals are as follows. The arbitrary
fixed-length framing used in block processing can be avoided by analysing the speech
signal within a pitch period. The assumption that a speech signal is stationary within
a 20 ms frame is not fully justified, as both the source and the system vary
continuously with time. In this work, pitch periods are marked using glottal
closure instants (GCIs). The signal between two consecutive GCIs is treated as one
pitch cycle. A zero frequency filter based method is used to determine the GCIs [13].
LPCCs, MFCCs and formant features are computed for each pitch cycle of the speech
signal.
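
The segmentation step can be illustrated with a minimal sketch. It assumes the GCI locations are already available as sample indices (the zero frequency filter method itself is not reproduced here), and it uses a synthetic 100 Hz sinusoid sampled at 8 kHz in place of real speech. The function names, the sampling rate, the FFT length and the GCI positions are all illustrative assumptions, not values taken from the text.

```python
import cmath
import math


def split_pitch_cycles(signal, gci_indices):
    """Return the samples between each pair of consecutive GCIs."""
    gcis = sorted(gci_indices)
    return [signal[start:end] for start, end in zip(gcis, gcis[1:])]


def magnitude_spectrum_db(cycle, n_fft=256):
    """Magnitude spectrum in dB of one pitch cycle, via a zero-padded DFT.

    Cycles longer than n_fft are truncated to n_fft samples.
    """
    samples = list(cycle)[:n_fft]
    padded = samples + [0.0] * (n_fft - len(samples))
    spectrum = []
    for k in range(n_fft // 2 + 1):  # bins from 0 Hz up to fs / 2
        acc = sum(x * cmath.exp(-2j * math.pi * k * n / n_fft)
                  for n, x in enumerate(padded))
        # Small offset avoids log10(0) for empty bins.
        spectrum.append(20.0 * math.log10(abs(acc) + 1e-12))
    return spectrum


# Synthetic stand-in for voiced speech (assumption): 100 Hz tone at 8 kHz,
# so one "pitch period" is 80 samples.
fs = 8000
f0 = 100
signal = [0.4 * math.sin(2 * math.pi * f0 * n / fs) for n in range(400)]

# Hypothetical GCI positions; in practice these come from a GCI detector.
gcis = [0, 80, 160, 240]

cycles = split_pitch_cycles(signal, gcis)
spectra = [magnitude_spectrum_db(c) for c in cycles]
```

Each entry of `spectra` corresponds to one pitch cycle, so the analysis window follows the pitch period rather than a fixed 20 ms frame (160 samples at the assumed 8 kHz rate).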
2.4 Classifiers
GMMs and AANNs are known to capture the general distribution of data points in the
feature space, and either can be used as an alternative to the other [14]. Two classifiers
are used in this study so that their emotion classification results can be compared.
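
As a rough illustration of how a distribution model can act as a classifier, the sketch below scores a feature vector against one diagonal-covariance GMM per emotion and picks the emotion with the highest log-likelihood. This is a common way to use GMMs for classification and is assumed here; the text does not state the scoring rule. The emotion labels, the model parameters and the function names are hypothetical, and no training step is shown.

```python
import math


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of feature vector x under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for xi, mi, vi in zip(x, mu, var):
            log_p += -0.5 * (math.log(2 * math.pi * vi) + (xi - mi) ** 2 / vi)
        component_logs.append(log_p)
    # Log-sum-exp over components for numerical stability.
    m = max(component_logs)
    return m + math.log(sum(math.exp(c - m) for c in component_logs))


def classify(x, models):
    """Return the label whose GMM gives x the highest log-likelihood."""
    return max(models, key=lambda label: gmm_log_likelihood(x, *models[label]))


# Hypothetical two-component models: (weights, means, variances) per emotion.
models = {
    "anger": ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]),
    "happiness": ([0.5, 0.5], [[5.0, 5.0], [6.0, 6.0]], [[1.0, 1.0], [1.0, 1.0]]),
}

predicted = classify([5.2, 5.1], models)
```

In a real system the weights, means and variances would be estimated from training features of each emotion, and scores would typically be accumulated over all pitch cycles of an utterance rather than taken from a single vector.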
 
 