6. And finally, the system believes that the word ends with either an
“a” sound (a probability of only 0.1) or with an “ow” sound (a probability
of 0.9).
The HMM for this particular recognition task is shown diagrammatically
in Figure 35.
The individual probability estimates in the HMM allow the software
to calculate the overall probability of each of the possible combinations
of sounds:
tahmeyta:   1 × 0.4 × 1 × 0.5 × 1 × 0.1 = 0.02
tahmeytow:  1 × 0.4 × 1 × 0.5 × 1 × 0.9 = 0.18
tahmaata:   1 × 0.4 × 1 × 0.5 × 1 × 0.1 = 0.02
tahmaatow:  1 × 0.4 × 1 × 0.5 × 1 × 0.9 = 0.18
towmeyta:   1 × 0.6 × 1 × 0.5 × 1 × 0.1 = 0.03
towmeytow:  1 × 0.6 × 1 × 0.5 × 1 × 0.9 = 0.27
towmaata:   1 × 0.6 × 1 × 0.5 × 1 × 0.1 = 0.03
towmaatow:  1 × 0.6 × 1 × 0.5 × 1 × 0.9 = 0.27
So the recognition system would pick two phoneme strings as equally
likely: “t ow m ey t ow” and “t ow m aa t ow”, each with a 0.27
probability of being correct.
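This exhaustive multiply-and-compare can be written out directly in code. The sketch below is a minimal Python illustration, not any particular recognizer's implementation: the per-position phoneme alternatives and their probabilities are the ones from the worked example above, and the enumeration simply multiplies the probabilities along every path through the model.

    from itertools import product

    # Phoneme alternatives at each position in the word, with their
    # probabilities (taken from the HMM for "tomato" in Figure 35).
    choices = [
        {"t": 1.0},               # initial consonant: certain
        {"ah": 0.4, "ow": 0.6},   # first vowel
        {"m": 1.0},               # middle consonant: certain
        {"ey": 0.5, "aa": 0.5},   # second vowel
        {"t": 1.0},               # second "t": certain
        {"a": 0.1, "ow": 0.9},    # final vowel
    ]

    # Enumerate every path through the model and multiply the
    # probabilities along it, exactly as in the table above.
    paths = {}
    for combo in product(*(c.items() for c in choices)):
        phonemes = " ".join(sound for sound, _ in combo)
        prob = 1.0
        for _, p in combo:
            prob *= p
        paths[phonemes] = prob

    # List the candidate phoneme strings from most to least likely.
    for phonemes, prob in sorted(paths.items(), key=lambda kv: -kv[1]):
        print(f"{phonemes}: {prob:.2f}")

Running it lists “t ow m ey t ow” and “t ow m aa t ow” first, each at 0.27, agreeing with the hand calculation.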
Contextual information can also be used on a whole word basis. Consider
for example the phrase “house of representatives”. If a speech recognition
system were to be reasonably certain that it recognized the words
“of” and “representatives”, but was unsure as to whether the first word
in the phrase was “house”, “louse”, “mouse”, “nouse” or “rouse”, it could
simply look through a database of phrases to see which occurs most often
and by how much. Try it for yourself with your favourite search engine.
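A minimal sketch of that whole-word lookup appears below. The phrase counts here are invented purely for illustration; a real system would consult a large corpus of phrases or, as suggested above, search-engine hit counts.

    # Hypothetical phrase counts, standing in for a database of phrases
    # or the number of hits a search engine returns for each candidate.
    phrase_counts = {
        "house of representatives": 1_000_000,  # invented figures
        "louse of representatives": 10,
        "mouse of representatives": 50,
        "nouse of representatives": 1,
        "rouse of representatives": 5,
    }

    candidates = ["house", "louse", "mouse", "nouse", "rouse"]

    # Pick the candidate whose phrase occurs most often in the corpus.
    best = max(candidates,
               key=lambda w: phrase_counts.get(w + " of representatives", 0))
    print(best)  # -> house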
Figure 35. The Hidden Markov Model for the word “tomato” (Courtesy of James
Matthews)