Digital Signal Processing Reference
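[Figure: segmentation accuracy rate (vertical axis, 0 to 100) against tolerance range in ms (horizontal axis, 5 to 40), with one series each for the context-sensitive HMM and the mono-phone HMM.]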
Fig. 2. Automatic segmentation accuracy rate comparison charts
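The accuracy rate at a given tolerance range can be read as the share of automatic boundaries that fall within that many milliseconds of the manually labelled boundary. The following is a minimal sketch of that calculation, assuming the automatic and manual boundary lists are aligned one-to-one; the function name boundary_accuracy and the sample times are hypothetical and are not data from the paper.

```python
def boundary_accuracy(auto_ms, manual_ms, tolerance_ms):
    """Percentage of automatic boundaries within tolerance_ms of the manual ones."""
    if len(auto_ms) != len(manual_ms):
        raise ValueError("boundary lists must align one-to-one")
    if not manual_ms:
        raise ValueError("no boundaries to score")
    hits = sum(1 for a, m in zip(auto_ms, manual_ms) if abs(a - m) <= tolerance_ms)
    return 100.0 * hits / len(manual_ms)


# Hypothetical boundary times in milliseconds (illustration only).
manual = [120, 250, 410, 560, 730]
auto = [124, 262, 405, 585, 731]

# Same tolerance ranges as the horizontal axis of the figure.
for tol in (5, 10, 20, 30, 40):
    print(tol, boundary_accuracy(auto, manual, tol))
```

Scoring each system over the same set of tolerance ranges gives one curve per system, which is the kind of comparison the figure plots.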
8 Conclusions
This paper presents automatic segmentation of Uyghur speech units using a
mono-phone HMM model and a context-sensitive HMM model. Experiments show that
context-sensitive HMM segmentation is more accurate than mono-phone HMM
segmentation. However, the mono-phone HMM method is simpler, so it can be used
when context-sensitive tagged lab files cannot be generated automatically.
The automatic segmentation method presented in this paper locates speech unit
boundaries accurately and consistently. Starting from the phoneme boundaries and
applying the rules for Uyghur syllables and words, it obtains syllable and word
boundaries, and can further obtain the boundaries of prosodic words, prosodic
phrases, intonation phrases and sentences. This saves considerable time and effort,
eliminates mechanical steps, reduces the workload of sound library construction,
and improves the accuracy of speech corpus annotation.
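Deriving higher-level boundaries from phoneme boundaries amounts to merging consecutive segments. The sketch below illustrates only that merging step. It assumes the grouping (how many phonemes form each syllable, how many syllables form each word) is already known, since the Uyghur syllable and word rules are not given in this section. The function name merge_segments, the segment tuple layout, and the sample word and times are hypothetical.

```python
def merge_segments(segments, group_sizes):
    """Merge consecutive (start_ms, end_ms, label) segments into larger units.

    group_sizes[i] is the number of consecutive input segments in output unit i.
    """
    if any(size <= 0 for size in group_sizes):
        raise ValueError("each group must contain at least one segment")
    if sum(group_sizes) != len(segments):
        raise ValueError("group sizes must cover all segments exactly")
    units = []
    i = 0
    for size in group_sizes:
        group = segments[i:i + size]
        label = "".join(seg[2] for seg in group)
        # A unit starts where its first member starts and ends where its last ends.
        units.append((group[0][0], group[-1][1], label))
        i += size
    return units


# Hypothetical phoneme segmentation of one word (times in ms, illustration only).
phones = [(0, 60, "k"), (60, 150, "i"), (150, 210, "t"), (210, 320, "a"), (320, 400, "b")]

syllables = merge_segments(phones, [2, 3])   # assumed syllabification: ki + tab
words = merge_segments(syllables, [2])       # both syllables form one word
print(syllables)
print(words)
```

The same merging step could be repeated for prosodic words, prosodic phrases and sentences once their groupings are available.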
Acknowledgements. This work is supported by the Program for New Century Excellent
Talents in University (NCET-10-0969), the Natural Science Foundation of China
(No. 61062008, 61065005), and the Key Technologies R&D Program of China
(2009BAH41B03).
 