Chatbot character, whilst others moved around the screen, for example the Tank and the Witch. A further study is required to determine whether a lack of lip-synchronization is, in itself, a significant contributor towards the uncanny when not associated with other factors of facial animation and sound. Thus, we intend a further experiment to test the hypothesis: Uncanniness increases with increasing perceptions of lack of synchronization between the character's lips and the character's sound.
At present there are no standards set for acceptable levels of asynchrony in computer games as there are for television. It may well be that these acceptable levels are the same across the two media, but it might equally be the case that the interactive nature of computer games, and the use of different reproduction technologies and paradigms, calls for a different standard. For example, perhaps current technological limitations in automated lip-syncing tools require a smaller window of acceptable asynchrony for computer games than previously established for television. We hope the future experiment noted above will also ascertain whether viewers are more sensitive to an asynchrony of speech for virtual characters where the audio stream precedes video (as has been previously identified for the television broadcasting industry).
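Until such a games standard exists, any tolerance window a developer adopts is an assumption. The C++ sketch below shows how a dialogue system might flag lip-sync drift against an asymmetric window, reflecting the broadcast observation that audio leading video is noticed sooner than audio lagging it; the threshold values, type names, and function names here are illustrative assumptions rather than established figures for games or television.

#include <iostream>

// Hypothetical asymmetric lip-sync tolerance window (milliseconds).
// The values are illustrative assumptions loosely modelled on broadcast
// practice; they are NOT an established standard for computer games.
struct SyncWindow {
    double maxAudioLeadMs = 45.0;   // audio ahead of video is noticed sooner
    double maxAudioLagMs  = 125.0;  // audio behind video is tolerated longer
};

// offsetMs > 0 means the audio stream precedes the video (audio lead);
// offsetMs < 0 means the audio lags behind the facial animation.
bool withinSyncTolerance(double offsetMs, const SyncWindow& w) {
    if (offsetMs >= 0.0) {
        return offsetMs <= w.maxAudioLeadMs;
    }
    return -offsetMs <= w.maxAudioLagMs;
}

int main() {
    SyncWindow window;
    // e.g. measured offsets between a viseme keyframe and its phoneme onset
    double measuredOffsets[] = { 30.0, 60.0, -100.0, -150.0 };
    for (double offset : measuredOffsets) {
        std::cout << offset << " ms: "
                  << (withinSyncTolerance(offset, window) ? "acceptable"
                                                          : "perceptible drift")
                  << "\n";
    }
    return 0;
}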
ARTICULATION OF SPEECH

Hundreds of individual muscles contribute to the generation of complex facial expressions and speech. The face is one of the most complex muscular regions of the human body and, with increased realism for characters, generating realistic animation for mouth movement and speech is a challenge for designers (Cao et al., 2004; Plantec, 2007). Even though the dynamics of each of these muscles are well understood, their combined effect is very difficult to simulate precisely. Whilst motion capture allows for the recording of high-fidelity facial animation and expression, this technique is mostly useful for FMV. Recorded motions are difficult to modify once transferred to a three-dimensional model, and the digital representation of the mouth remains an area requiring further modification. Editing motion capture data often involves careful key-framing by a talented animator. A developer may edit individual frames of existing motion capture data for prerecorded trailers and cut scenes yet, for computer games, most visual material is generated in real time during gameplay. For in-game play, automatic simulation of the muscles within and surrounding the mouth is necessary to match mouth movement with speech. Motion capture by itself cannot be used for automated facial animation.
To create automatic visual simulation of mouth movement with speech, computer game engines require a set of visemes as the visual representation for each phoneme sound. Faceposer (Valve, 2008) uses the phoneme classes phonemes, phonemes strong, and phonemes weak, with a corresponding viseme to represent each syllable within the International Phonetic Alphabet (IPA). Prerecorded speech is imported into a phoneme extractor tool that extracts the most appropriate phoneme (and corresponding viseme) for recognized syllables. Editing tools allow for the creation of new phoneme classes and for modification of the mouth shape of an existing viseme.
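The kind of lookup such an engine needs can be sketched simply. The C++ fragment below models a minimal phoneme-to-viseme table with per-class articulation scaling in the spirit of the phonemes, phonemes strong, and phonemes weak classes described above; the enum, table entries, blend-shape names, and weights are hypothetical illustrations and do not reproduce Faceposer's actual data or API.

#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical phoneme class, echoing the strong/weak distinction above.
enum class PhonemeClass { Normal, Strong, Weak };

struct Viseme {
    std::string blendShape;  // mouth shape driven on the face rig
    float weight;            // articulation intensity, 0..1
};

// A tiny example table: phoneme -> viseme (a real table covers the full IPA).
std::unordered_map<std::string, Viseme> makeVisemeTable(PhonemeClass cls) {
    // Strong speech articulates more, weak speech less (scales are assumed).
    float scale = (cls == PhonemeClass::Strong) ? 1.0f
                : (cls == PhonemeClass::Weak)   ? 0.4f : 0.7f;
    return {
        { "p", { "viseme_PP", 0.9f * scale } },  // bilabial closure
        { "f", { "viseme_FF", 0.8f * scale } },  // labiodental
        { "a", { "viseme_AA", 1.0f * scale } },  // open vowel
        { "i", { "viseme_IH", 0.7f * scale } },
    };
}

int main() {
    // Phoneme sequence as a phoneme extractor tool might emit for "pie".
    std::vector<std::string> phonemes = { "p", "a", "i" };
    auto table = makeVisemeTable(PhonemeClass::Normal);
    for (const auto& p : phonemes) {
        const Viseme& v = table.at(p);
        std::cout << p << " -> " << v.blendShape << " @ " << v.weight << "\n";
    }
    return 0;
}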
The UM study (Tinwell & Grimshaw, 2010) identified a strong relationship between how uncanny a character was perceived to be and a perceived exaggeration of facial expression for the mouth. The results implied that those characters perceived to have an over-exaggeration of mouth movement were regarded as more strange. Thus, uncanniness increases with increasing exaggeration of articulation of the mouth during speech. Finer adjustments to mouth shapes using tools such as Faceposer may prevent a perceived over-exaggeration of articulation of speech, yet such adjustments are time consuming for the developer.
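A cheaper alternative to hand-adjusting every mouth shape might be to attenuate articulation globally at runtime. The sketch below assumes a hypothetical per-frame array of viseme weights and applies a tunable gain and hard ceiling to each; the gain and cap values are assumptions a developer would tune by observation, not figures drawn from the UM study.

#include <algorithm>
#include <iostream>
#include <vector>

// Hypothetical post-process that attenuates and clamps viseme weights to
// avoid over-exaggerated mouth articulation. Gain and cap are assumed
// tuning values, not published thresholds.
void attenuateVisemeWeights(std::vector<float>& weights,
                            float gain = 0.8f,   // global articulation scale
                            float cap  = 0.85f)  // hard ceiling per viseme
{
    for (float& w : weights) {
        w = std::min(w * gain, cap);
        w = std::max(w, 0.0f);
    }
}

int main() {
    // Raw weights as they might arrive from a phoneme extractor per frame.
    std::vector<float> frameWeights = { 1.0f, 0.95f, 0.6f, 0.3f };
    attenuateVisemeWeights(frameWeights);
    for (float w : frameWeights) std::cout << w << " ";
    std::cout << "\n";
    return 0;
}

Whether such blanket attenuation preserves the legibility of speech movements is an open question; it trades precision for development time.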
If no original visual footage is available for speech,