Owens distinguishes, however. In fact, we consider only two cases for each
cluster, rounded and widened, which represent the instances farthest from the
neutral expression. For instance, the viseme associated with /m/ differs depending
on whether the speaker is uttering the sequence omo or umu or the
sequence eme or imi. In the former case, the /m/ viseme assumes a rounded
shape, while in the latter it assumes a more widened shape. Therefore, each
consonant was assigned a viseme of each of these two types. For the visemes that
correspond to vowels, we used those proposed by Montgomery et al. (1985).
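The context-dependent choice described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the `cluster_shape` label format, the priority given to the following vowel, and the fallback shape for contexts the text does not cover are all hypothetical. Only the o/u (rounded) and e/i (widened) contexts come from the text.

```python
from typing import Optional

# Vowel contexts named in the text: omo/umu give the rounded /m/ viseme,
# eme/imi give the widened one.
ROUNDED_VOWELS = {"o", "u"}
WIDENED_VOWELS = {"e", "i"}


def vowel_shape(vowel: Optional[str]) -> Optional[str]:
    """Return 'rounded' or 'widened' for a known vowel, else None."""
    if vowel in ROUNDED_VOWELS:
        return "rounded"
    if vowel in WIDENED_VOWELS:
        return "widened"
    return None


def consonant_viseme(
    cluster: str,
    prev_vowel: Optional[str],
    next_vowel: Optional[str],
    default: str = "widened",
) -> str:
    """Pick the rounded or widened viseme of a consonant cluster.

    Assumptions (not from the text): the following vowel takes priority
    over the preceding one, and `default` is used when neither neighbor
    is one of the four vowels listed above.
    """
    shape = vowel_shape(next_vowel) or vowel_shape(prev_vowel) or default
    return f"{cluster}_{shape}"


print(consonant_viseme("m", "o", "o"))  # /m/ in "omo"
print(consonant_viseme("m", "e", "e"))  # /m/ in "eme"
```

Keeping the shape decision separate from the cluster label means the same rule applies to every consonant cluster, which matches the systematic rounded/widened split described in the text.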
As shown in Figure 2, the selection contains a total of 20 visemes: 12 representing
the consonants (boxes with title “consonant”), seven representing the monophthongs
(boxes with title “monophthong”), and one representing the neutral pose (box with
title “silence”). Diphthongs (box with title “diphthong”) are split into two
separate monophthongs, and their mutual influence is handled as a
coarticulation effect. The boxes with the smaller title “allophones” can be
disregarded for the moment. The table also contains examples of
words that produce the visemes when pronounced. This viseme selection
differs from others proposed earlier. It contains more consonant visemes than
most, mainly because the distinction between the rounded and widened shapes
is made systematically. For comparison, Ezzat and Poggio (Ezzat et
al., 2000) used six (only one for each of Owens' consonant groups, while also
combining two of them), Bregler et al. (1997) used ten (the same clusters, but they
subdivided the cluster /t,d,s,z,th,dh/ into /th,dh/ and the rest, and
/k,g,n,l,ng,h,y/ into /ng/, /h/, /y/, and the rest, which amounts to an even more
precise subdivision for this cluster), and Massaro (1998) used nine (but this
Figure 2. Overview of the visemes used.
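The composition of the inventory and the treatment of diphthongs can be illustrated with a minimal Python sketch. The counts (12, 7, 1) are taken from the text. The diphthong table, the phoneme symbols, and the function name are hypothetical placeholders chosen for illustration and do not reproduce the actual contents of Figure 2.

```python
from typing import Dict, List, Tuple

# Viseme counts as stated in the text.
INVENTORY: Dict[str, int] = {"consonant": 12, "monophthong": 7, "silence": 1}
assert sum(INVENTORY.values()) == 20

# Hypothetical diphthong table: each diphthong maps to the two
# monophthongs it is split into. The symbols are illustrative only.
DIPHTHONG_PARTS: Dict[str, Tuple[str, str]] = {
    "ai": ("a", "i"),
    "au": ("a", "u"),
    "oi": ("o", "i"),
}


def expand_diphthongs(phonemes: List[str]) -> List[str]:
    """Replace each diphthong by its two monophthongs, in order.

    Phonemes not found in DIPHTHONG_PARTS are passed through unchanged.
    The mutual influence of the two halves is not modelled here; the
    text leaves that to the coarticulation stage.
    """
    expanded: List[str] = []
    for phoneme in phonemes:
        expanded.extend(DIPHTHONG_PARTS.get(phoneme, (phoneme,)))
    return expanded


print(expand_diphthongs(["m", "ai", "n"]))  # ['m', 'a', 'i', 'n']
```

Splitting diphthongs before viseme lookup keeps the vowel inventory at the seven monophthong visemes, so no diphthong-specific mouth shapes need to be stored.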