Image Processing Reference
In-Depth Information
Indeed, these 3 parts of the head are the ones actually involved in the animation. The number of
vertices can be reduced in a post-processing step, with a corresponding loss of quality that is not
severe if the reduction is applied to the back and top of the head. Moreover, a texture must be
loaded for each polygon mesh, but all the meshes can use the same image file as texture to
save memory. A basic viseme can provide both the image texture and the texture coordinates,
so that the common texture can be positioned correctly on the others.
4.2 ANIMATION
The facial movements are performed by morphing. Morphing starts from a sequence of
geometry objects called “keyframes”. Each vertex of a keyframe translates from its position to
that of the corresponding vertex in the subsequent keyframe. For this reason we
have to generate a set of visemes instead of modifying a single geometric model of the head. Such
an approach is less efficient than an animation engine able to modify the shape according
to facial parameters (tongue position, labial protrusion and so on), but it greatly simplifies
the programming: the whole mesh is considered in the morphing process, and
efficient morphing engines are available in many computer graphics libraries. Various
parameters have to be set to control each morphing step between two keyframes, e.g. the
translation time. In our animation scheme, the keyframes are the visemes related to the
phrase to be pronounced, but to obtain realistic facial movements they cannot be inserted in
the sequence without considering facial coarticulation. Coarticulation is the natural
modification of the facial muscles that generates a succession of fundamental facial movements during
phonation. The Löfqvist gestural model described in Löfqvist (1990) controls the audio-visual
synthesis; such a model defines the “dominant visemes”, which influence both the preceding
and subsequent ones. Each keyframe must be blended dynamically with the adjacent ones.
The next section is devoted to this task, showing a mathematical model for the coarticulation.
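
The morphing step between two keyframes can be illustrated with a minimal sketch. It assumes plain linear interpolation of corresponding vertices, which the text does not specify; the names `morph`, `key_a`, `key_b` and the normalized time `t` are hypothetical and do not refer to any particular graphics library.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]


def morph(key_a: List[Vertex], key_b: List[Vertex], t: float) -> List[Vertex]:
    """Return the intermediate mesh between two keyframes.

    Assumption: each vertex moves on a straight line from its position
    in key_a to the position of the corresponding vertex in key_b, and
    t is the elapsed fraction of the translation time (0 = key_a, 1 = key_b).
    """
    if len(key_a) != len(key_b):
        raise ValueError("keyframes must have the same number of vertices")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return [
        tuple(a + t * (b - a) for a, b in zip(va, vb))
        for va, vb in zip(key_a, key_b)
    ]


# Two tiny "visemes" with two vertices each (illustrative values only).
viseme_a = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
viseme_b = [(2.0, 0.0, 0.0), (1.0, 3.0, 1.0)]
halfway = morph(viseme_a, viseme_b, 0.5)
```

Because the two keyframes must share vertex count and ordering, every viseme has to be built from the same mesh topology, which is why a full set of visemes is generated in advance.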
4.2.1 COHEN-MASSARO MODEL
The Cohen-Massaro model (Cohen & Massaro, 1993) computes the weights that control the
keyframe animation. These weights determine the vertex positions of an intermediate mesh
between two keyframes. The model is based on coarticulation, which is the influence of the adjacent
speech sounds on the current one during phonation. This phenomenon can also be
taken into account when interpolating a frame, by considering the adjacent ones so
that the facial movements appear more natural. Indeed, the Cohen-Massaro model builds
on the work by Löfqvist, in which a speech segment exerts a stronger influence on the
articulatory organs of the face than the adjacent segments do. Dominance is the name given
to such an influence, and it can be mathematically defined as a time-dependent function. In
particular, an exponential function is adopted as the dominance function. The dominance
function proposed in our approach is simplified with respect to the original one: it is
symmetric. The profile of the dominance function for a given speech segment S and facial
parameter P is expressed by the following equation:
\[ D_{SP} = \alpha \cdot \exp\left(-\theta \, |\tau|^{C}\right) \qquad (3) \]
where α is the peak value at τ = 0, θ and C control the slope of the function, and τ is the time variable
measured from the midpoint of the speech segment duration. In our implementation we set C = 1
to reduce the number of parameters to be tuned. The dominance function reaches its maximum
value (α) at the midpoint of the speech segment duration, where τ = 0. In the present approach,
we assume that the time interval of each viseme is the same as the duration of the respective
phoneme. Coarticulation can be thought of as composed of two sub-phenomena: the
pre- and post- articulation. The former consists in the influence of the present viseme on the
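
As a rough illustration of Eq. (3), the sketch below evaluates the symmetric dominance function with C = 1 and turns the dominances of several speech segments into blending weights. The normalization step (each weight is a dominance divided by the sum of all dominances) is an assumption made here for illustration and is not stated in the text above; the function names, the shared α and θ values and the segment midpoints are hypothetical.

```python
import math
from typing import List


def dominance(tau: float, alpha: float = 1.0, theta: float = 1.0,
              c: float = 1.0) -> float:
    """Eq. (3): D = alpha * exp(-theta * |tau|**c).

    tau is the time measured from the midpoint of the speech segment,
    so the function is symmetric and peaks at alpha when tau = 0.
    """
    return alpha * math.exp(-theta * abs(tau) ** c)


def blend_weights(t: float, midpoints: List[float],
                  alpha: float = 1.0, theta: float = 1.0) -> List[float]:
    """Blending weights of the segments at absolute time t.

    Assumption: weights are the dominances normalized to sum to 1,
    and all segments share the same alpha and theta.
    """
    values = [dominance(t - m, alpha, theta) for m in midpoints]
    total = sum(values)
    return [v / total for v in values]


# Three segments centred at 0.0 s, 0.2 s and 0.4 s (illustrative values).
weights = blend_weights(0.05, [0.0, 0.2, 0.4], alpha=1.0, theta=10.0)
```

With these settings the segment whose midpoint is closest to `t` receives the largest weight, while its neighbours still contribute, which is the blending behaviour the dominance function is meant to produce.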