Digital Signal Processing Reference
1) Initialize the mono-phone HMM model.
2) Use the EM (expectation-maximization) algorithm to find the maximum likelihood estimate of the mono-phone model parameters.
3) Extend the mono-phone HMM model into a context-based tri-phone HMM model.
4) Find the maximum likelihood estimate of the tri-phone model parameters via the EM algorithm.
5) Cluster the states of the tri-phone model with a classification and regression tree.
6) Gradually increase the number of Gaussian mixture components in each state of the tri-phone HMM model.
7) Find the maximum likelihood estimate of the model parameters through iteration.
8) Repeat steps (6) and (7) several times.
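Steps 6) to 8) alternate between adding Gaussian components and re-estimating the parameters. The following is a minimal sketch of that loop for one state only, under several assumptions that are not taken from the paper: observations are 1-D, all frames are already assigned to the state (so no HMM state occupancies are computed), a new component is created by splitting the heaviest one and shifting the two means by 0.2 standard deviations, and variances are floored at a small constant. The names `em_step`, `split_heaviest` and `train_state` are hypothetical.

```python
import math

VAR_FLOOR = 1e-3  # assumed variance floor, not a value from the paper


def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, mix):
    """Total log-likelihood of data under a mixture [(weight, mean, var), ...]."""
    return sum(
        math.log(max(sum(w * gauss_pdf(x, m, v) for w, m, v in mix), 1e-300))
        for x in data
    )


def em_step(data, mix):
    """One EM iteration (step 7): soft-assign frames, then re-estimate parameters."""
    k = len(mix)
    occ, s1, s2 = [0.0] * k, [0.0] * k, [0.0] * k
    for x in data:
        dens = [w * gauss_pdf(x, m, v) for w, m, v in mix]
        total = sum(dens) or 1e-300
        for j in range(k):
            g = dens[j] / total  # responsibility of component j for x
            occ[j] += g
            s1[j] += g * x
            s2[j] += g * x * x
    new_mix = []
    for j in range(k):
        n = max(occ[j], 1e-12)
        mean = s1[j] / n
        var = max(s2[j] / n - mean * mean, VAR_FLOOR)
        new_mix.append((occ[j] / len(data), mean, var))
    return new_mix


def split_heaviest(mix, perturb=0.2):
    """Step 6: replace the heaviest component by two copies with shifted means."""
    j = max(range(len(mix)), key=lambda i: mix[i][0])
    w, m, v = mix[j]
    d = perturb * math.sqrt(v)
    return mix[:j] + [(w / 2, m - d, v), (w / 2, m + d, v)] + mix[j + 1:]


def train_state(data, max_components=2, em_iters=100):
    """Step 8: alternate EM re-estimation and splitting up to max_components."""
    mean = sum(data) / len(data)
    var = max(sum((x - mean) ** 2 for x in data) / len(data), VAR_FLOOR)
    mix = [(1.0, mean, var)]
    while True:
        for _ in range(em_iters):
            mix = em_step(data, mix)
        if len(mix) >= max_components:
            return mix
        mix = split_heaviest(mix)
```

Calling `train_state(data, max_components=4)` on a list of floats returns the mixture after three splits. Growing the mixture one component at a time lets every new component start from an already trained one instead of from a random initialization.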
In this paper, HMM training on the training data produced two kinds of models. The first is the mono-phone HMM model (monophone.mmf), which contains the mono-phone models that appear in the training corpus. The second is the context-sensitive tri-phone model after clustering (clustered.mmf).
5 Automatic Phoneme Segmentation
In this paper, we performed the segmentation with the HVite tool of HTK. The segmentation process is as follows:
1) Compute the spectrum parameters and fundamental frequencies from the waveform to be segmented.
2) Create the lab file from the tagged file.
3) Make model predictions with HHEd.
4) Obtain the segmented file from HVite segmentation.
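The core of the segmentation step is a forced alignment: the phone sequence is known from the lab file, and only the frame boundaries have to be found. The sketch below illustrates the idea with a Viterbi search, under simplifying assumptions that are not taken from the paper: each phone is a single state, there are no transition probabilities, and the per-frame log-likelihoods are already given as dictionaries. The function name `force_align` and the input layout are hypothetical. It is not a description of HVite internals.

```python
def force_align(frame_scores, phones):
    """Align a known phone sequence to frames.

    frame_scores[t][p] is the log-likelihood of frame t under phone label p.
    Returns a list of (phone, start_frame, end_frame_exclusive).
    """
    T, N = len(frame_scores), len(phones)
    if N == 0 or T < N:
        raise ValueError("need at least one frame per phone")
    NEG = float("-inf")
    # best[t][i]: best score of a path that is in phone i at frame t
    best = [[NEG] * N for _ in range(T)]
    # entered[t][i]: True if that best path moved from phone i-1 to i at frame t
    entered = [[False] * N for _ in range(T)]
    best[0][0] = frame_scores[0][phones[0]]
    for t in range(1, T):
        for i in range(min(t + 1, N)):
            stay = best[t - 1][i]
            move = best[t - 1][i - 1] if i > 0 else NEG
            if move > stay:
                best[t][i] = move + frame_scores[t][phones[i]]
                entered[t][i] = True
            else:
                best[t][i] = stay + frame_scores[t][phones[i]]
    # Backtrace from the last phone at the last frame to recover start frames.
    starts = [0] * N
    i = N - 1
    for t in range(T - 1, 0, -1):
        if entered[t][i]:
            starts[i] = t
            i -= 1
    ends = starts[1:] + [T]
    return list(zip(phones, starts, ends))
```

The returned frame indices can be turned into times by multiplying with the frame shift used when the spectrum parameters were computed.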
Figure 1 shows the automatic segmentation process. The two segmentation methods used in this paper are basically the same. One difference is the input: mono-phone automatic segmentation is given the mono-phone list file of every sentence to be segmented together with the mono-phone HMM model, whereas context-sensitive HMM segmentation is given the context-sensitive tagged lab file and the tri-phone HMM model. The other difference is that mono-phone HMM segmentation does not need model prediction (segmentation step 3), whereas context-sensitive HMM automatic segmentation requires it.
[Figure 1 block labels: Speech data; Spectrum parameters; Fundamental frequency; Labels; HVite segmentation module; Segmented wave data]
Fig. 1. Automatic segmentation processes
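The difference between the mono-phone list and the context-sensitive labels can be illustrated by expanding a phone sequence into tri-phone names. The sketch below assumes the HTK-style `left-centre+right` notation and treats every phone, including silence, in the same way. The paper does not state its exact label format, so both choices are assumptions, and `to_triphones` is a hypothetical helper.

```python
def to_triphones(phones):
    """Expand a mono-phone sequence into context-dependent labels.

    Each label has the form left-centre+right; the left or right part is
    omitted at the edges of the sequence, where no neighbour exists.
    """
    labels = []
    for i, p in enumerate(phones):
        left = phones[i - 1] + "-" if i > 0 else ""
        right = "+" + phones[i + 1] if i + 1 < len(phones) else ""
        labels.append(left + p + right)
    return labels
```

For a three-phone sentence `["a", "b", "c"]`, the middle phone becomes `a-b+c`, which is why the tri-phone segmentation may need models for contexts that never occurred in training and therefore relies on the model prediction step.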