Robotics Reference
In-Depth Information
ing by being able to recognize only the words for the digits 0 to 9. After a
brief training session with a new voice, Audrey would dial most of the
numbers correctly much of the time, but it was rarely able to get a
complete seven-digit telephone number right.
Audrey was trained on Balashek's voice, and the machine stored the
speech patterns (spectrograms) of those ten words as he spoke them.
When Audrey heard a spoken word, the incoming sound was first analyzed
by its electronics, and then a spectrogram pattern for the word was
produced. Then this pattern was automatically compared with the stored
spectrogram patterns for Balashek's voice for each of the ten digits, in
order to find the nearest match. Within about 0.2 seconds the machine
illuminated a bulb to indicate which of the ten digits it thought had
been spoken, choosing the most probable word (i.e., the closest match),
unless the incoming sound bore almost no resemblance to any of the ten
stored patterns (in which case Audrey simply did nothing).
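The matching step described above (compare the incoming pattern with ten stored patterns, pick the closest, and stay silent if nothing is close) can be sketched in software. This is only an illustration: Audrey did this with its own electronics, and the text does not say how closeness was measured. The Euclidean distance, the short feature vectors standing in for spectrograms, the names `distance` and `recognize`, and the `reject_threshold` value are all assumptions made for the example.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors.

    Assumption: the source does not say how Audrey measured similarity.
    """
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(pattern, templates, reject_threshold):
    """Return the label of the closest stored template.

    Returns None when even the best match is farther away than
    reject_threshold, mirroring the case where the machine did nothing.
    """
    best_label, best_dist = None, float("inf")
    for label, template in templates.items():
        d = distance(pattern, template)
        if d < best_dist:
            best_label, best_dist = label, d
    if best_dist > reject_threshold:
        return None
    return best_label


# Hypothetical three-value "spectrograms" for three of the ten digits.
templates = {
    0: [1.0, 0.0, 0.0],
    1: [0.0, 1.0, 0.0],
    2: [0.0, 0.0, 1.0],
}

print(recognize([0.9, 0.1, 0.0], templates, reject_threshold=0.5))
print(recognize([5.0, 5.0, 5.0], templates, reject_threshold=0.5))
```

The first call lies close to the stored pattern for digit 0 and so returns that label; the second resembles none of the stored patterns and is rejected. Because the templates come from a single speaker, any voice whose patterns sit far from them will be matched poorly or rejected.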
Audrey performed quite well when Balashek spoke to the machine,
but with other male speakers Audrey made mistakes ten to thirty percent
of the time, depending on the characteristics of the speaker's voice. With
female and children's voices the results were rather poor because of
their different pitches and other voice characteristics.
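A little arithmetic shows why errors at the level of single digits make a complete telephone number hard to get right. The sketch below takes the ten and thirty percent error rates quoted for other male speakers and assumes, purely for illustration, that each digit is recognized independently of the others. That independence assumption and the function name `full_number_accuracy` are not from the source.

```python
def full_number_accuracy(per_digit_error, digits=7):
    """Chance that every digit in a number is recognized correctly.

    Assumption: errors on different digits are independent.
    """
    return (1.0 - per_digit_error) ** digits


for error in (0.10, 0.30):
    print(f"per-digit error {error:.0%}: "
          f"full seven-digit number correct {full_number_accuracy(error):.1%}")
```

Under these assumptions a ten percent error rate per digit leaves a little under half of all seven-digit numbers fully correct, and a thirty percent rate leaves fewer than one in ten.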