consonants, the context plays a stronger role. If they immediately follow a vowel
among /o/ , /u/ , and /@@/ (the vowel in “bird”), then the allophone is
mapped onto a rounded consonant. If the vowel is among /i/ , /a/ , and /e/ then the
allophone is mapped onto a widened consonant. When the consonant is not
preceded immediately by a vowel, but the subsequent allophone is one, then a
similar decision is made. If the consonant is flanked by two other consonants, the
preceding vowel decides.
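The context rule above can be sketched in a few lines of Python. This is only an illustration, not the system's actual code: the function names and the list-of-strings phone representation are hypothetical, the "similar decision" for a following vowel is assumed to use the same two vowel groups, and "the preceding vowel" is read as the nearest vowel earlier in the sequence.

```python
# Vowel groups taken from the text; /@@/ is the vowel in "bird".
ROUNDING_VOWELS = {"o", "u", "@@"}
WIDENING_VOWELS = {"i", "a", "e"}


def vowel_shape(phone):
    """Return the lip shape a vowel imposes on a neighbouring consonant."""
    if phone in ROUNDING_VOWELS:
        return "rounded"
    if phone in WIDENING_VOWELS:
        return "widened"
    return None  # a consonant, or a vowel outside the two groups


def consonant_variant(phones, i):
    """Pick the viseme variant for the consonant at index i (hypothetical helper)."""
    # 1. An immediately preceding vowel decides first.
    if i > 0:
        shape = vowel_shape(phones[i - 1])
        if shape:
            return shape
    # 2. Otherwise look at the immediately following allophone
    #    (assumption: same vowel groups as in step 1).
    if i + 1 < len(phones):
        shape = vowel_shape(phones[i + 1])
        if shape:
            return shape
    # 3. Flanked by consonants: fall back to the nearest earlier vowel
    #    (assumption: this is what "the preceding vowel" means).
    for j in range(i - 2, -1, -1):
        shape = vowel_shape(phones[j])
        if shape:
            return shape
    return None  # no context available; keep the neutral consonant viseme
```

For example, in the sequence `["s", "t", "u", "l"]` the consonant `t` takes its shape from the following `u`, while in `["i", "l", "k", "s"]` the consonant `k` is flanked by consonants and falls back to the earlier `i`.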
From these data — the ordered list of visemes and their timing — the system
automatically generates an animation. The concatenation of the selected visemes
can be achieved elegantly as a navigation through a “Viseme Space,” similar to
a Face Space. The Viseme Space is obtained by applying an Independent
Component Analysis to all extracted example visemes. It turned out that the
variation can be captured well with as few as 16 Independent Components. (This
underlying dimensionality is determined in the PCA step that is part of our ICA
implementation (Hyvärinen, 1997).) Every personalized viseme can be represented
as one point in this 16D Viseme Space. Animation boils down to
successively applying the deformations represented by the points along a
trajectory that leads from viseme to viseme, and that is influenced by
co-articulation effects. An important advantage of animating in Viseme Space is
that all visited deformations remain realistic.
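As a rough illustration of animation as navigation through such a space, the sketch below samples a trajectory between timed viseme points. It is a deliberate simplification: it uses plain piecewise-linear interpolation rather than a trajectory shaped by co-articulation, and the function name and the `(time, point)` key format are hypothetical.

```python
from bisect import bisect_right


def sample_trajectory(keys, t):
    """Sample a point in Viseme Space at time t.

    keys is a list of (time, point) pairs sorted by strictly increasing
    time; each point is a sequence of coordinates (16 in the text, but
    any dimension works here).
    """
    times = [time for time, _ in keys]
    # Clamp to the first or last viseme outside the keyed interval.
    if t <= times[0]:
        return list(keys[0][1])
    if t >= times[-1]:
        return list(keys[-1][1])
    # Find the segment [t0, t1) that contains t.
    i = bisect_right(times, t) - 1
    t0, p0 = keys[i]
    t1, p1 = keys[i + 1]
    a = (t - t0) / (t1 - t0)  # blend factor in [0, 1)
    # Linear blend of the two neighbouring viseme points.
    return [(1.0 - a) * x0 + a * x1 for x0, x1 in zip(p0, p1)]
```

Calling `sample_trajectory` once per frame yields the sequence of points whose deformations are applied to the face. Replacing the linear blend with a smoother curve changes only the last line of the function.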
Performing animation as navigation through a Viseme Space of some sort is not
new per se . Such an approach was already demonstrated by Kalberer and Van Gool
Figure 13. Fitting splines in the “Viseme Space” yields good co-articulation
effects, after attraction forces exerted by the individual nodes (visemes)
were learned from ground-truth data.