that sound wrong when enunciated using the rules, are stored intact in
an exceptions dictionary together with their phonetic representation.
Conversion of the input text into a sequence of phonetic symbols is
thus performed by a combination of letter-to-sound rules and an excep-
tions dictionary, resulting in a sequence of phonetic symbols for each
word. This sequence readily translates into a sequence of phonemes. A
phoneme is a written representation of the smallest unit of sound that
can distinguish words, smallest in the sense that if one phoneme in a
word is changed, the word is pronounced differently. 13
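
The combination of an exceptions dictionary and letter-to-sound rules described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the dictionary entries, the rules, and the phonetic symbols (loosely ARPAbet-like) are all hypothetical and far smaller than anything a real system would use.

```python
# Hypothetical exceptions dictionary: words that the rules would get wrong,
# stored whole together with their phonetic representation.
EXCEPTIONS = {
    "one": ["W", "AH", "N"],
    "two": ["T", "UW"],
}

# Hypothetical letter-to-sound rules. Two-letter patterns are tried before
# single letters so that "sh" is not read as "s" followed by "h".
DIGRAPHS = {"sh": "SH", "ch": "CH", "th": "TH"}
SINGLE = {
    "a": "AE", "b": "B", "c": "K", "d": "D", "e": "EH", "f": "F",
    "g": "G", "h": "HH", "i": "IH", "k": "K", "l": "L", "m": "M",
    "n": "N", "o": "AA", "p": "P", "r": "R", "s": "S", "t": "T",
    "u": "AH",
}


def word_to_phonetic(word):
    """Return the phonetic symbols for one word."""
    word = word.lower()
    # Step 1: the exceptions dictionary takes priority over the rules.
    if word in EXCEPTIONS:
        return list(EXCEPTIONS[word])
    # Step 2: otherwise apply the letter-to-sound rules left to right.
    symbols = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in DIGRAPHS:
            symbols.append(DIGRAPHS[pair])
            i += 2
        elif word[i] in SINGLE:
            symbols.append(SINGLE[word[i]])
            i += 1
        else:
            i += 1  # no rule for this character (e.g. punctuation): skip it
    return symbols


def text_to_phonetic(text):
    """Split the text into words and convert each word separately."""
    return [word_to_phonetic(w) for w in text.split()]
```

With these toy tables, "ship" is handled entirely by the rules, whereas "one" is found in the exceptions dictionary, because the rules alone would read it letter by letter.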
Corresponding to each phoneme in a language there is an allophone,
the speech sound that is represented on paper by the phoneme, 14 and it
is the allophones that enable us to complete the conversion of text into
speech. Each allophone can be extracted from the recorded speech of
someone speaking a word containing that allophone. And given the se-
quence of phonemes corresponding to a word, it is fairly straightforward
to reproduce the corresponding sequence of allophones, strung together
to create the spoken form of the word, thereby completing the process.
In summary, the process is:
whole text → individual words → phonetic symbols → phonemes → allophones → spoken words
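
The final step, stringing allophones together to form a spoken word, might be sketched as below. The allophone bank here is an assumption made purely for illustration: in a real system each entry would be a snippet cut from recorded speech, whereas this sketch substitutes short synthetic tones so that it is self-contained. The sample rate, symbol names, and function names are likewise hypothetical.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)


def tone(freq_hz, amplitude, duration_ms=50):
    """Stand-in for a recorded allophone: a short sine tone."""
    n = SAMPLE_RATE * duration_ms // 1000
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE)
        for t in range(n)
    ]


# Hypothetical bank mapping each phoneme symbol to its allophone samples.
ALLOPHONE_BANK = {
    "SH": tone(300, 0.3),
    "IH": tone(220, 0.8),
    "P": tone(150, 0.5),
}


def synthesize_word(phonemes):
    """Concatenate the stored allophones for a sequence of phonemes."""
    samples = []
    for symbol in phonemes:
        samples.extend(ALLOPHONE_BANK[symbol])
    return samples


word_samples = synthesize_word(["SH", "IH", "P"])
```

Nothing here smooths the joins between successive allophones; the segments are simply placed end to end.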
But this is not the end of the story. There are still two major problems
to be overcome in the creation of natural-sounding speech. Firstly, when
allophones are strung together and played out as a word, the complete
sound is often not exactly what one expects to hear for that word and
sometimes it can be quite an unpleasant sound. This is because of what
are called discontinuities between two successive allophones. A disconti-
nuity occurs when one allophone ends in a sound at a particular pitch
level (frequency) and/or a particular volume, while the next allophone
starts at a noticeably different pitch level and/or volume. The jump from
the sound at the end of one allophone to the sound at the start of the
next manifests itself as a screeching sound or some other distracting
effect.
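
One rough way to quantify such a discontinuity is to compare the volume and pitch just before a join with the volume and pitch just after it. The sketch below is a simplification under stated assumptions, not the method of any particular system: volume is estimated by root-mean-square amplitude, pitch by counting zero crossings (adequate only for clean, tone-like signals), and the window size, sample rate, and function names are all hypothetical.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this sketch)


def tone(freq_hz, amplitude, n=400):
    """Synthetic stand-in for an allophone recording."""
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * t / SAMPLE_RATE)
        for t in range(n)
    ]


def rms(samples):
    """Root-mean-square amplitude, used here as a simple volume measure."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def zero_crossing_pitch(samples):
    """Crude pitch estimate in Hz: a pure tone crosses zero twice per cycle."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    return crossings * SAMPLE_RATE / (2 * len(samples))


def join_discontinuity(left, right, window=200):
    """Compare the end of one allophone with the start of the next."""
    tail, head = left[-window:], right[:window]
    return {
        "volume_jump": abs(rms(head) - rms(tail)),
        "pitch_jump_hz": abs(
            zero_crossing_pitch(head) - zero_crossing_pitch(tail)
        ),
    }


smooth = join_discontinuity(tone(200, 0.5), tone(200, 0.5))
rough = join_discontinuity(tone(200, 0.2), tone(400, 0.8))
```

Here `smooth` joins two matching segments and should show jumps close to zero, while `rough` joins a quiet, low tone to a loud, higher one and should show large jumps in both measures, the kind of mismatch that is heard as a distracting effect.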
13 Different languages have different numbers of phonemes and allophones, ranging from only
ten for the language of the Pirahã people of Brazil to 141 (including several for clicking sounds) for
one of the Khoisan languages of Southern Africa. The exact number of phonemes used in English
varies from one speaker to another; typically it is between 40 and 45.
14 The words “phoneme” and “allophone” are often confused, with the former being used incor-
rectly in place of the latter.