et al. [ 129 ]. FACS decomposes an expression into “action units” related to the activity
of facial muscles, which an animator can control to create a character's expression.
Sifakis et al. [ 448 ] related facial markers to a highly detailed anatomical model
of the head that included bones, muscle, and soft tissue, using a nonlinear optimization
similar to the methods in Section 7.4.2. Alternatively, the facial markers can
be directly related to the vertices of a dense 3D mesh of the head's surface (e.g.,
[ 54 ]), acquired using laser scanning or structured light (both discussed in detail in
Chapter 8 ).
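To make the action-unit idea concrete, the following Python sketch deforms a face mesh with a weighted sum of per-action-unit vertex displacements, in the spirit of a simple linear blendshape rig. The array names and the linear model are illustrative assumptions, not the implementation of any of the cited systems.

import numpy as np

def apply_action_units(neutral_verts, au_deltas, au_weights):
    """Deform a neutral face mesh by a weighted sum of per-action-unit
    vertex displacements (a simple linear blendshape-style model).

    neutral_verts : (V, 3) array of rest-pose vertex positions.
    au_deltas     : (K, V, 3) array; au_deltas[k] is the displacement of
                    every vertex when action unit k is fully activated.
    au_weights    : (K,) array of activation levels, typically in [0, 1].
    """
    w = np.asarray(au_weights, dtype=float).reshape(-1, 1, 1)
    return neutral_verts + (w * au_deltas).sum(axis=0)

# Tiny example: a 4-vertex "mesh" and two action units with made-up shapes.
neutral = np.zeros((4, 3))
deltas = np.random.randn(2, 4, 3) * 0.01
posed = apply_action_units(neutral, deltas, [0.8, 0.2])

An animator (or a capture pipeline) would vary the weight vector over time to produce the character's performance.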
One of the earliest facial motion capture tests was described by Williams [ 547 ],
who taped dots of retro-reflective Scotchlite material to a performer's face and used
the dots' 2D positions to animate a 3D head model obtained using a laser scanner.
Guenter et al.'s seminal work [ 183 ] described a homemade motion capture framework
using 182 fluorescent dots glued to a performer's face that were imaged under ultra-
violet illumination. The triangulated 3D dot positions were used to move the vertices
of a 3D head mesh obtained using a laser scanner. Lin and Ouhyoung [ 284 ] described
a unique approach that uses a single video of a scene containing the performer and a
pair of mirrors, effectively giving three views of the markers from different perspec-
tives. In several recent films (e.g., TRON: Legacy , Avatar , and Rise of the Planet of the
Apes ), actors performed on set wearing facial markers whose motion was recorded by
a rigid rig of head-mounted cameras, in essence carrying miniature motion-capture
studios along with them (see Section 7.8 ).
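Recovering a marker's 3D position from its 2D observations in two or more calibrated views reduces to a small linear triangulation problem. The sketch below uses the standard direct linear transform (DLT) formulation; the camera projection matrices are assumed to be given, and this is a generic illustration rather than the specific pipeline of any system mentioned above.

import numpy as np

def triangulate_marker(proj_mats, image_pts):
    """Linear (DLT) triangulation of a single marker.

    proj_mats : list of 3x4 camera projection matrices (assumed calibrated).
    image_pts : list of (x, y) pixel observations of the same marker,
                one per camera.
    Returns the 3D point that best satisfies x ~ P X in a least-squares sense.
    """
    rows = []
    for P, (x, y) in zip(proj_mats, image_pts):
        rows.append(x * P[2] - P[0])
        rows.append(y * P[2] - P[1])
    A = np.vstack(rows)
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]  # dehomogenize

In a setup like Lin and Ouhyoung's mirror arrangement, three projection matrices (one per real or virtual camera) and three dot observations would be passed in for each marker.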
On the other hand, marker-based technology is only part of the process of facial
capture for visual effects today. In particular, the non-marker-based MOVA Con-
tour system is extremely popular and is used to construct highly detailed facial
meshes and animation rigs for actors prior to on-set motion capture. With this
system, phosphorescent makeup is applied to the performer's entire face. Under
normal lighting, the makeup is invisible, but under fluorescent lighting, the makeup
glows green and has a mottled texture that generates dense, evenly spaced visual
features in the resulting images. The performer is filmed from the front by many
cameras, and dense, accurate 3D geometry is computed using multi-view stereo
techniques, discussed in Section 8.3. This technology was notably used in The Curious
Case of Benjamin Button . In related approaches, Furukawa and Ponce [ 159 ] painted
a subject's face with a visible mottled pattern, and Bickel et al. [ 44 ] augmented
facial markers with visible paint around a performer's forehead and eyes to track
wrinkles.
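The point of the mottled texture is that a small patch around any pixel becomes distinctive enough to match reliably against another camera's image; full multi-view stereo methods (Section 8.3) build on this kind of matching. The fragment below is a bare-bones normalized-cross-correlation disparity search over an assumed rectified image pair, intended only to make the matching idea concrete; it is not the MOVA algorithm.

import numpy as np

def best_match_ncc(left, right, row, col, patch=7, max_disp=64):
    """Find the disparity of pixel (row, col) by maximizing normalized
    cross-correlation of a small patch along the same row of a rectified
    grayscale image pair. Illustrative sketch only (no bounds checking
    beyond the search limit)."""
    r = patch // 2
    ref = left[row - r:row + r + 1, col - r:col + r + 1].astype(float)
    ref = (ref - ref.mean()) / (ref.std() + 1e-8)
    best_d, best_score = 0, -np.inf
    for d in range(max_disp):
        c = col - d
        if c - r < 0:
            break
        cand = right[row - r:row + r + 1, c - r:c + r + 1].astype(float)
        cand = (cand - cand.mean()) / (cand.std() + 1e-8)
        score = (ref * cand).mean()
        if score > best_score:
            best_d, best_score = d, score
    return best_d, best_score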
Facial capture techniques that require no markers or makeup are also a major
research focus in the computer vision and graphics communities. Bradley et al. [ 63 ]
described a system in which the performer's head is surrounded by seven pairs of
high-resolution stereo cameras zoomed in to use pores, blemishes, and hair follicles
as trackable features. The performer is lit by a bright array of LED lights to provide
uniform illumination. The 3D stereo reconstructions (i.e., stereo correspondence
followed by triangulation) are merged to create a texture-mapped mesh, and opti-
cal flow is used to propagate dense correspondence of the face images throughout
each camera's video sequence. This can be viewed as a multi-view stereo algorithm,
discussed in detail in Section 8.3 . Another major approach is the projection of struc-
tured light patterns onto a performer's face, which introduces artificial texture used
for multi-view stereo correspondence. This approach is typified by the work of Zhang
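To give a flavor of the optical-flow propagation step described above, the sketch below tracks a set of 2D feature locations from frame to frame using pyramidal Lucas-Kanade flow from OpenCV. The cv2.calcOpticalFlowPyrLK call is a real OpenCV function, but the surrounding pipeline (how points are seeded from the mesh and how lost or drifting tracks are handled) is an assumption, not Bradley et al.'s published system.

import cv2
import numpy as np

def propagate_points(frames, seed_pts):
    """Track 2D points through a grayscale video with pyramidal
    Lucas-Kanade optical flow. frames is a list of uint8 images;
    seed_pts is an (N, 2) float32 array of locations in frames[0]
    (e.g., projections of mesh vertices). Returns one (N, 2) array
    of locations per frame."""
    pts = seed_pts.reshape(-1, 1, 2).astype(np.float32)
    tracks = [pts.reshape(-1, 2).copy()]
    for prev, curr in zip(frames[:-1], frames[1:]):
        pts, status, _err = cv2.calcOpticalFlowPyrLK(
            prev, curr, pts, None, winSize=(21, 21), maxLevel=3)
        # status flags points that could not be tracked; a real system
        # would re-seed or discard them, which this sketch does not do.
        tracks.append(pts.reshape(-1, 2).copy())
    return tracks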