Robotics Reference
per second—in accordance with the sound that is required. Articulatory
synthesizers imitate the effects of the physical human mouth on air as it
passes up through the vocal cords, with each element of a speech sound
described in terms of the position and movement of the mouth.
Concatenative synthesizers use a broad range of speech units, called
allophones or diphones, extracted from recordings of human speech, with
linguistic rules to select the appropriate speech units, which are then
linked in order to produce the full speech sounds. The description of a
TTS system which follows is primarily of concatenative synthesis.
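As a rough illustration of the concatenative idea, the sketch below splits a sequence of phone labels into diphone names and then links the corresponding recorded units end to end. It is a minimal sketch under stated assumptions, not the method of any particular TTS system: the function names, the "_" silence marker, the phone labels other than AA, and the toy inventory of sample values are all hypothetical, and a real synthesizer would also smooth the joins between units.

```python
def to_diphones(phones, silence="_"):
    """Turn a phone sequence into diphone names (adjacent phone pairs).

    The sequence is padded with a silence marker on both sides so that
    the first and last phones also get a transition unit.
    """
    padded = [silence, *phones, silence]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


def concatenate(unit_names, inventory):
    """Link recorded units in order into one list of samples.

    `inventory` is a hypothetical mapping from diphone name to the
    samples of that recorded unit; no smoothing is applied at the joins.
    """
    samples = []
    for name in unit_names:
        samples.extend(inventory[name])
    return samples


if __name__ == "__main__":
    units = to_diphones(["f", "AA", "r"])
    print(units)  # ['_-f', 'f-AA', 'AA-r', 'r-_']

    # Toy inventory: integers stand in for audio samples.
    toy_inventory = {"_-f": [0, 1], "f-AA": [2, 3], "AA-r": [4, 5], "r-_": [6]}
    print(concatenate(units, toy_inventory))  # [0, 1, 2, 3, 4, 5, 6]
```

Padding with a silence marker is one simple way to give the first and last sounds of a word their own units; other unit inventories are equally possible.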
The first part of the conversion process from text consists of creating
a phonetic representation of each word. There is an internationally
recognized set of phonetic symbols, the International Phonetic Alphabet,
and it is possible to write every word using these symbols in such a way
that someone familiar with the system of symbols could then pronounce
the word correctly. For example, the word “contract” looks like
[phonetic transcription missing from this copy] in phonetic symbols.
The first task then for a TTS system is to create, from each word in
the text, the equivalent representation in phonetic symbols. The basis
for this particular part of the process is a set of what are called
letter-to-sound rules. These rules recognize small clusters of vowels and
consonants within words, and are often dependent on morphs: syllables or
other short strings of letters that typically make up prefixes, suffixes and
roots of words. For example, “snow” is a single morph, but “snowplough”
is made up of two morphs.
The form of a typical rule is:
When a precedes r, and r is not followed by either a vowel
or another r within the same morph, a is pronounced AA
(as in far or cartoon) unless it is preceded either by w (as
in warble, warp, war, wharf) or by qu (as in quarter).
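The rule above can be written as a short program, which makes its conditions and exceptions explicit. The following is only a sketch under stated assumptions: the function names are hypothetical, only a, e, i, o and u are treated as vowels, "wh" is counted as a preceding w because the rule lists wharf among its examples, and the word is assumed to have been split into morphs already.

```python
VOWELS = set("aeiou")  # assumption: 'y' is not treated as a vowel here


def a_before_r_is_aa(morph, i):
    """Return True if the letter at index i of `morph` is an 'a' that
    the rule pronounces AA. Only letters inside this morph are examined."""
    if morph[i] != "a":
        return False
    # The 'a' must be immediately followed by 'r'.
    if i + 1 >= len(morph) or morph[i + 1] != "r":
        return False
    # The 'r' must not be followed by a vowel or another 'r'
    # within the same morph.
    if i + 2 < len(morph) and (morph[i + 2] in VOWELS or morph[i + 2] == "r"):
        return False
    # Exceptions: 'a' preceded by w (including 'wh', as in wharf) or by qu.
    if morph[:i].endswith(("w", "wh", "qu")):
        return False
    return True


def aa_positions(morphs):
    """Return (morph, index) pairs for every 'a' the rule maps to AA."""
    hits = []
    for morph in morphs:
        lowered = morph.lower()
        for i in range(len(lowered)):
            if a_before_r_is_aa(lowered, i):
                hits.append((morph, i))
    return hits


if __name__ == "__main__":
    print(aa_positions(["far"]))          # [('far', 1)]
    print(aa_positions(["war"]))          # []
    print(aa_positions(["far", "away"]))  # [('far', 1)]
```

Because each morph is checked on its own, the a that begins the second morph of a compound such as far + away does not block the rule for the r that ends the first morph.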
The earliest robust set of letter-to-sound rules for English was devised
by Honey Sue Elovitz, Rodney Johnson, Astrid McHugh and John Shore
at the U.S. Naval Research Laboratory in 1976, but neither that set of
rules nor later sets could convert anything like all the words in the
dictionary with satisfactory results. There are too many exceptions that run
counter to the rules, and too many quirks of linguistic pronunciation for
a relatively small set of rules to be able to cope completely, so the idea of
an exceptions dictionary was born. Nowadays TTS systems employ rules
that deal adequately with many words, while the exception words, those