Chatbot character, whilst others moved around the screen, for example the Tank and the Witch. A further study is required to determine whether a lack of lip-synchronization is, in itself, a significant contributor towards the uncanny when not associated with other factors of facial animation and sound. Thus, we intend a further experiment to test the hypothesis: Uncanniness increases with increasing perceptions of lack of synchronization between the character's lips and the character's sound.
At present there are no standards set for acceptable levels of asynchrony in computer games as there are for television. It may well be that these acceptable levels are the same across the two media, but it might equally be the case that the interactive nature of computer games, and the use of different reproduction technologies and paradigms, calls for a different standard. For example, perhaps current technological limitations in automated lip-syncing tools require a smaller window of acceptable asynchrony for computer games than previously established for television. We hope the future experiment noted above will also ascertain whether viewers are more sensitive to an asynchrony of speech for virtual characters where the audio stream precedes video (as has been previously identified for the television broadcasting industry).
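Until such a games standard exists, any tolerance window a developer adopts is an assumption. The C++ sketch below shows how a dialogue system might flag lip-sync drift against an asymmetric window, reflecting the broadcast observation that audio leading video is noticed sooner than audio lagging it; the threshold values, type names, and function names here are illustrative assumptions rather than established figures for games or television.

#include <iostream>

// Hypothetical asymmetric lip-sync tolerance window (milliseconds).
// The values are illustrative assumptions loosely modelled on broadcast
// practice; they are NOT an established standard for computer games.
struct SyncWindow {
    double maxAudioLeadMs = 45.0;   // audio ahead of video is noticed sooner
    double maxAudioLagMs  = 125.0;  // audio behind video is tolerated longer
};

// offsetMs > 0 means the audio stream precedes the video (audio lead);
// offsetMs < 0 means the audio lags behind the facial animation.
bool withinSyncTolerance(double offsetMs, const SyncWindow& w) {
    if (offsetMs >= 0.0) {
        return offsetMs <= w.maxAudioLeadMs;
    }
    return -offsetMs <= w.maxAudioLagMs;
}

int main() {
    SyncWindow window;
    // e.g. measured offsets between a viseme keyframe and its phoneme onset
    double measuredOffsets[] = { 30.0, 60.0, -100.0, -150.0 };
    for (double offset : measuredOffsets) {
        std::cout << offset << " ms: "
                  << (withinSyncTolerance(offset, window) ? "acceptable"
                                                          : "perceptible drift")
                  << "\n";
    }
    return 0;
}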
ARTICULATION OF SPEECH

Hundreds of individual muscles contribute to the generation of complex facial expressions and speech. The face is one of the most complex muscular regions of the human body and, with increased realism for characters, generating realistic animation for mouth movement and speech is a challenge for designers (Cao et al., 2004; Plantec, 2007). Even though the dynamics of each of these muscles are well understood, their combined effect is very difficult to simulate precisely. Whilst motion capture allows for the recording of high-fidelity facial animation and expression, this technique is mostly useful for FMV. Recorded motions are difficult to modify once transferred to a three-dimensional model, and the digital representation of the mouth remains an area requiring further modification. Editing motion capture data often involves careful key-framing by a talented animator. A developer may edit individual frames of existing motion capture data for prerecorded trailers and cut scenes yet, for computer games, most visual material is generated in real time during gameplay. For in-game play, automatic simulation of the muscles within and surrounding the mouth is necessary to match mouth movement with speech. Motion capture by itself cannot be used for automated facial animation.
To create automatic visual simulation of mouth movement with speech, computer game engines require a set of visemes as the visual representation for each phoneme sound. Faceposer (Valve, 2008) uses the phoneme classes phonemes, phonemes strong, and phonemes weak, with a corresponding viseme to represent each syllable within the International Phonetic Alphabet (IPA). Prerecorded speech is imported into a phoneme extractor tool that extracts the most appropriate phoneme (and corresponding viseme) for recognized syllables. Editing tools allow for the creation of new phoneme classes and for modification of the mouth shape of an existing viseme.
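The kind of lookup such an engine needs can be sketched simply. The C++ fragment below models a minimal phoneme-to-viseme table with per-class articulation scaling in the spirit of the phonemes, phonemes strong, and phonemes weak classes described above; the enum, table entries, blend-shape names, and weights are hypothetical illustrations and do not reproduce Faceposer's actual data or API.

#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical phoneme class, echoing the strong/weak distinction above.
enum class PhonemeClass { Normal, Strong, Weak };

struct Viseme {
    std::string blendShape;  // mouth shape driven on the face rig
    float weight;            // articulation intensity, 0..1
};

// A tiny example table: phoneme -> viseme (a real table covers the full IPA).
std::unordered_map<std::string, Viseme> makeVisemeTable(PhonemeClass cls) {
    // Strong speech articulates more, weak speech less (scales are assumed).
    float scale = (cls == PhonemeClass::Strong) ? 1.0f
                : (cls == PhonemeClass::Weak)   ? 0.4f : 0.7f;
    return {
        { "p", { "viseme_PP", 0.9f * scale } },  // bilabial closure
        { "f", { "viseme_FF", 0.8f * scale } },  // labiodental
        { "a", { "viseme_AA", 1.0f * scale } },  // open vowel
        { "i", { "viseme_IH", 0.7f * scale } },
    };
}

int main() {
    // Phoneme sequence as a phoneme extractor tool might emit for "pie".
    std::vector<std::string> phonemes = { "p", "a", "i" };
    auto table = makeVisemeTable(PhonemeClass::Normal);
    for (const auto& p : phonemes) {
        const Viseme& v = table.at(p);
        std::cout << p << " -> " << v.blendShape << " @ " << v.weight << "\n";
    }
    return 0;
}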
The UM study (Tinwell & Grimshaw, 2010) identified a strong relationship between how uncanny a character was perceived to be and a perceived exaggeration of facial expression for the mouth. The results implied that those characters perceived to have an over-exaggeration of mouth movement were regarded as more strange. Thus, uncanniness increases with increasing exaggeration of articulation of the mouth during speech. Finer adjustments to mouth shapes using tools such as Faceposer may prevent a perceived over-exaggeration of articulation of speech, yet such adjustments are time consuming for the developer.
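A cheaper alternative to hand-adjusting every mouth shape might be to attenuate articulation globally at runtime. The sketch below assumes a hypothetical per-frame array of viseme weights and applies a tunable gain and hard ceiling to each; the gain and cap values are assumptions a developer would tune by observation, not figures drawn from the UM study.

#include <algorithm>
#include <iostream>
#include <vector>

// Hypothetical post-process that attenuates and clamps viseme weights to
// avoid over-exaggerated mouth articulation. Gain and cap are assumed
// tuning values, not published thresholds.
void attenuateVisemeWeights(std::vector<float>& weights,
                            float gain = 0.8f,   // global articulation scale
                            float cap  = 0.85f)  // hard ceiling per viseme
{
    for (float& w : weights) {
        w = std::min(w * gain, cap);
        w = std::max(w, 0.0f);
    }
}

int main() {
    // Raw weights as they might arrive from a phoneme extractor per frame.
    std::vector<float> frameWeights = { 1.0f, 0.95f, 0.6f, 0.3f };
    attenuateVisemeWeights(frameWeights);
    for (float w : frameWeights) std::cout << w << " ";
    std::cout << "\n";
    return 0;
}

Whether such blanket attenuation preserves the legibility of speech movements is an open question; it trades precision for development time.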
If no original visual footage is available for speech,