One way of estimating 3-D motion is the explicit computation of an optical flow
field (Horn et al., 1981; Barron et al., 1994; Dufaux et al., 1995), followed by
the derivation of motion parameters from the resulting dense displacement field
(Netravali et al., 1984; Essa et al., 1994; Bartlett et al., 1995). Since the
computation of the flow field from the optical flow constraint equation (Horn et
al., 1981), which relates image gradient information (Simoncelli, 1994) to 2-D
image displacements, is an underdetermined problem, additional smoothness
constraints must be added (Horn, 1986; Barron et al., 1994). This yields a
non-linear cost function (Barron et al., 1994) that is minimized numerically.
Hierarchical frameworks (Enkelmann, 1988; Singh, 1990; Sezan et al., 1993) can
reduce the computational complexity of the optimization in this high-dimensional
parameter space. However, even if the global minimum is found, the heuristic
smoothness constraints may lead to deviations from the correct flow field,
especially at object boundaries and depth discontinuities.
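To make this concrete, the optical flow constraint equation and a
smoothness-regularized cost function of the Horn-Schunck type can be written as
follows; the notation (I_x, I_y, I_t for the spatial and temporal image
gradients, lambda for the smoothness weight) is standard rather than taken from
the cited papers:

    % Optical flow constraint: one linear equation per pixel in the two
    % unknown displacement components (u, v), hence underdetermined.
    \begin{equation}
      I_x u + I_y v + I_t = 0
    \end{equation}
    % Brightness constancy plus a smoothness term with weight \lambda gives
    % the non-linear cost function that is minimized numerically; linearizing
    % the first term around (x, y) recovers the constraint above.
    \begin{equation}
      E(u, v) = \iint \bigl( I(x + u, y + v, t + 1) - I(x, y, t) \bigr)^2
        + \lambda \bigl( \lVert \nabla u \rVert^2 + \lVert \nabla v \rVert^2 \bigr)
        \, \mathrm{d}x \, \mathrm{d}y
    \end{equation}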
In model-based motion estimation, the heuristic smoothness constraints are,
therefore, often replaced by explicit motion constraints derived from the 3-D
object models. For rigid-body motion estimation (Kappei, 1988; Koch, 1993), the
3-D motion model, specified by three rotational and three translational degrees
of freedom, restricts the possible flow fields in the image plane. Under the
assumption of perspective projection, known object shape, and small motion
between two successive video frames, an explicit displacement field can be
derived that is linear in the six unknown degrees of freedom (Longuet-Higgins,
1984; Netravali et al., 1984; Waxman et al., 1987). This displacement field can
easily be combined with the optical flow constraint to obtain a robust estimator
for the six motion parameters. Iterative estimation in an analysis-synthesis
framework (Li et al., 1993) removes remaining errors caused by the linearization
of image intensity and the motion model.
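As an illustration, the following Python sketch (hypothetical function and
variable names, not code from the cited works) builds this linear displacement
field from the per-pixel depth given by the known object shape and solves the
combined optical flow constraint for the six parameters in the least-squares
sense; it assumes image coordinates relative to the principal point and a known
focal length:

    import numpy as np

    def estimate_rigid_motion(Ix, Iy, It, x, y, Z, f):
        """Least-squares estimate of the six rigid-body motion parameters
        (wx, wy, wz, tx, ty, tz) from the optical flow constraint.
        Ix, Iy, It: image gradients; x, y: image coordinates relative to
        the principal point; Z: per-pixel depth from the 3-D object model;
        f: focal length. All arguments are flat arrays of equal length.
        (Illustrative sketch, not code from the cited works.)"""
        # Small-motion displacement field under perspective projection,
        # linear in the six unknowns:
        #   u = -wx*x*y/f + wy*(f + x*x/f) - wz*y + (f*tx - x*tz)/Z
        #   v = -wx*(f + y*y/f) + wy*x*y/f + wz*x + (f*ty - y*tz)/Z
        zero = np.zeros_like(x)
        u_coef = np.stack([-x*y/f, f + x*x/f, -y, f/Z, zero, -x/Z], axis=1)
        v_coef = np.stack([-(f + y*y/f), x*y/f, x, zero, f/Z, -y/Z], axis=1)
        # Optical flow constraint Ix*u + Iy*v + It = 0 gives one linear
        # equation per pixel: A @ p = b with p = (wx, wy, wz, tx, ty, tz).
        A = Ix[:, None] * u_coef + Iy[:, None] * v_coef
        b = -It
        p, *_ = np.linalg.lstsq(A, b, rcond=None)
        return p

In an analysis-synthesis framework, such an estimate would be refined
iteratively: the model is rendered with the updated parameters, and the
estimation is repeated on the residual motion between the synthesized and the
actual frame.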
For facial expression analysis, the rigid-body assumption can no longer be
maintained; surface deformations due to facial expressions must additionally be
taken into account. Most approaches in the literature (Ostermann, 1994; Choi et
al., 1994; Black et al., 1995; Pei, 1998; Li et al., 1998) separate this
problem into two steps. First, global head motion is estimated under the
assumption of rigid-body motion. Local motion caused by facial expressions is
regarded as noise (Li et al., 1994b) and, therefore, the textured areas around the
mouth and the eyes are often excluded from the estimation (Black et al., 1995;
Li et al., 1994b). Given head position and orientation, the remaining residuals
of the motion-compensated frame are used to estimate local deformations and
facial expressions. In Black et al. (1995, 1997), several 2-D motion models
with six (affine) or eight parameters are used to model local facial
deformations. By combining these models with the optical flow constraint, the
unknown parameters are estimated in much the same way as in the rigid-body case.
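For concreteness, the six-parameter affine model and an eight-parameter model
of the kind described in these works can be written as follows (generic
parameter names a_0, ..., a_7; the quadratic terms of the eight-parameter model
approximate the image motion of a planar surface patch):

    % Six-parameter affine motion model:
    \begin{align}
      u(x, y) &= a_0 + a_1 x + a_2 y \\
      v(x, y) &= a_3 + a_4 x + a_5 y
    \end{align}
    % Eight-parameter (planar) model with two additional quadratic terms:
    \begin{align}
      u(x, y) &= a_0 + a_1 x + a_2 y + a_6 x^2 + a_7 x y \\
      v(x, y) &= a_3 + a_4 x + a_5 y + a_6 x y + a_7 y^2
    \end{align}

Substituted into the optical flow constraint, each model again yields one
linear equation per pixel in its parameters, so the estimation reduces to a
linear least-squares problem per image region.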