3D Face Modeling - 3D Face Modeling, Analysis and Recognition


Three parameters encode normal information, while the remaining three contain tangential motion information. On the basis of the estimated local motion parameters, the whole mesh is then deformed by minimizing the sum of three energy terms:

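$$E = \sum_i \left\| v_i^f - \hat{v}_i^f \right\|^2 + \eta_1 \left\| \left( \zeta_2 \Delta^2 - \zeta_1 \Delta \right) v_i^f \right\|^2 + \eta_2 E_r\!\left(v_i^f\right) \tag{1.14}$$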

The first term, the data term, measures the squared distance between the vertex position $v_i^f$ and the position $\hat{v}_i^f$ estimated by the local estimation process. The second uses the discrete Laplacian operator $\Delta$ of a local parameterization of the surface at $v_i$ to enforce smoothness [the values $\zeta_1 = 0.6$ and $\zeta_2 = 0.4$ are used in all experiments (Furukawa and Ponce, 2009)]. This term is very similar to the Laplacian regularizer used in many other algorithms (Ponce, 2008). The third term is also used for regularization, and it enforces local tangential rigidity with no stretch, shrink, or shear. The total energy is minimized with respect to the 3D positions of all the vertices by a conjugate gradient method. In the case of deformable surfaces such as human faces, a non-static target edge length is computed on the basis of the non-rigid tangential deformation from the reference frame to the current one at each vertex. The tangential deformation is estimated at each frame before motion estimation starts, and its parameters are fixed within a frame. Thus, the tangential rigidity term $E_r(v_i^f)$ for a vertex $v_i^f$ in the global mesh deformation is given by

$$E_r\!\left(v_i^f\right) = \sum_{v_j \in N(v_i)} \max\!\left[0,\ \left(e_{ij} - \hat{e}_{ij}\right)^2 - \tau^2\right] \tag{1.15}$$

which is the sum of squared differences between the actual edge lengths $e_{ij}$ and the lengths $\hat{e}_{ij}$ predicted from the reference frame to the current frame. The threshold $\tau$ makes the penalty zero when the deviation is small, so that this regularization term is enforced only when the data term is unreliable and the error is large. In all experiments, $\tau$ is set to 0.2 times the average edge length of the mesh at the first frame. Figure 1.8 shows some results of the motion capture approach proposed in Furukawa and Ponce (2009).
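The energy of Equation (1.14) and the rigidity term of Equation (1.15) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of Furukawa and Ponce (2009): the function names, the `neighbors` adjacency lists, and the `target_len` dictionary are hypothetical; a uniform "umbrella" Laplacian stands in for the Laplacian of a local surface parameterization; the defaults of 1.0 for η1 and η2 are placeholders, since the text does not give their values; and the threshold is applied as τ² so that it is commensurate with the squared edge-length deviation.

```python
import math


def rigidity_term(i, verts, neighbors, target_len, tau):
    """E_r for vertex i: thresholded squared edge-length deviation over its neighbors."""
    total = 0.0
    for j in neighbors[i]:
        e_ij = math.dist(verts[i], verts[j])      # actual edge length
        e_hat = target_len[frozenset((i, j))]     # predicted (target) edge length
        # Dead zone: no penalty while the deviation stays below tau.
        total += max(0.0, (e_ij - e_hat) ** 2 - tau ** 2)
    return total


def umbrella_laplacian(field, neighbors):
    """Uniform Laplacian (assumed stand-in): neighbor mean minus the vertex value."""
    out = []
    for i, p in enumerate(field):
        ring = neighbors[i]
        mean = [sum(field[j][k] for j in ring) / len(ring) for k in range(3)]
        out.append(tuple(m - x for m, x in zip(mean, p)))
    return out


def total_energy(verts, estimates, neighbors, target_len, tau,
                 eta1=1.0, eta2=1.0, zeta1=0.6, zeta2=0.4):
    """Sum over vertices of data, smoothness, and rigidity terms."""
    lap = umbrella_laplacian(verts, neighbors)    # Laplacian applied once
    lap2 = umbrella_laplacian(lap, neighbors)     # Laplacian applied twice
    energy = 0.0
    for i in range(len(verts)):
        data = sum((a - b) ** 2 for a, b in zip(verts[i], estimates[i]))
        smooth = sum((zeta2 * b - zeta1 * a) ** 2 for a, b in zip(lap[i], lap2[i]))
        rigid = rigidity_term(i, verts, neighbors, target_len, tau)
        energy += data + eta1 * smooth + eta2 * rigid
    return energy


# Toy usage: a single triangle whose target edge lengths equal its rest lengths.
rest = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
nbrs = [[1, 2], [0, 2], [0, 1]]
target = {frozenset((0, 1)): 1.0,
          frozenset((0, 2)): 1.0,
          frozenset((1, 2)): math.sqrt(2.0)}
print(total_energy(rest, rest, nbrs, target, tau=0.2))
```

In practice this scalar energy would be handed to a conjugate gradient solver with the vertex positions as unknowns; the sketch only evaluates it. Because of the dead zone, small stretches of an edge leave the rigidity term at zero, so the data term dominates wherever the local motion estimates are reliable.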

Finally, after surface deformation, the residuals of the data and tangential rigidity terms are used to filter out erroneous motion estimates. Concretely, these values are first smoothed, and a smoothed local motion estimate is deemed an outlier if at least one of the two residuals exceeds a given threshold. These three steps (local motion estimation, global mesh deformation, and filtering) are iterated a couple of times to complete tracking in each frame, with the local motion estimation step applied only to vertices whose parameters have not yet been estimated or have been filtered out.
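The filtering step can be illustrated with a short sketch. The text does not say how the residuals are smoothed or how the thresholds are chosen, so the one-ring averaging, the separate threshold per residual, and all names below are assumptions made for illustration only.

```python
def smooth_over_neighbors(values, neighbors):
    """Average each per-vertex value with its one-ring neighbors (assumed smoothing)."""
    smoothed = []
    for i, v in enumerate(values):
        ring = [v] + [values[j] for j in neighbors[i]]
        smoothed.append(sum(ring) / len(ring))
    return smoothed


def flag_outliers(data_res, rigid_res, neighbors, data_thresh, rigid_thresh):
    """Mark a vertex as an outlier if either smoothed residual exceeds its threshold."""
    d = smooth_over_neighbors(data_res, neighbors)
    r = smooth_over_neighbors(rigid_res, neighbors)
    return [di > data_thresh or ri > rigid_thresh for di, ri in zip(d, r)]


# Toy usage: a chain of four vertices with one large data residual at the end.
chain = [[1], [0, 2], [1, 3], [2]]
flags = flag_outliers([0.0, 0.0, 0.0, 9.0], [0.0, 0.0, 0.0, 0.0],
                      chain, data_thresh=2.0, rigid_thresh=1.0)
print(flags)
```

Vertices flagged here would have their motion parameters discarded and re-estimated in the next iteration of the three steps. Smoothing before thresholding means a single bad vertex also raises the residuals of its neighbors, which is why the neighbor of the faulty vertex can be flagged as well.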

The face capture framework proposed by Bradley et al. (2010) operates without markers and consists of three main components: acquisition, multiview reconstruction, and geometry and texture tracking. The acquisition stage uses 14 high-definition video cameras arranged in seven binocular stereo pairs. At the multiview reconstruction stage, each pair captures a highly detailed small patch of the face surface under bright ambient light. This stage uses an iterative binocular stereo method to reconstruct seven surface patches independently, which are then merged into a single high-resolution mesh; the stereo algorithm is guided by face details and yields meshes of roughly 1 million polygons. First, depth maps are created from pairs of

