where T is a temporal window of frames centered at the current time instant t₀.
The basic observation is that the spatio-temporal profile centered at a pixel is more
discriminative than the spatial neighborhood alone, which is the same concept that
underlies the space-time analysis of structured light patterns illustrated in Figure 8.14.
The number of frames in T and the way they are chosen is up to the user, and could
depend on the speed of motion in the scene; in the same way, the size of the window
W_T depends on the image resolution and amount of scene texture.
The advantage of the space-time framework is that it is well suited to dynamic
scenes, since accurate correspondences may be difficult to estimate on a frame-by-
frame basis. Zhang et al. accounted for surfaces that are non-fronto-parallel and/or
moving using a linear prediction of each pixel's changing disparity (i.e., the spatio-
temporal windows are "slanted" along all three axes, not rectangular solids). They
later introduced further constraints to enforce spatio-temporal consistency in the
estimates by formulating the optimization globally [570].
Since the projector is not calibrated against the cameras and the pattern is not coded,
any pattern with high-frequency spatial detail will do. Zhang et al. [569] used shuffled
Gray codes and binary checkerboards, while Davis et al. [114] used random patterns
of binary vertical stripes and even a simple flashlight. Both groups based the cost
function C in Equation (8.13) simply on the sum of squared differences.
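To make the cost concrete, the following is a minimal NumPy sketch of a spatio-temporal SSD matching cost in the spirit of Equation (8.13). The function name, window parameterization, and the integer rounding of the slanted disparity are illustrative assumptions, not the authors' implementation; a practical version would interpolate sub-pixel shifts and vectorize the loops.

```python
import numpy as np

def spacetime_ssd(left, right, x0, y0, t0, d0,
                  dx=0.0, dy=0.0, dt=0.0, half_w=3, half_t=1):
    """Spatio-temporal SSD matching cost at pixel (x0, y0), frame t0.

    left, right : rectified grayscale video volumes, shape (frames, H, W).
    d0          : candidate disparity at the window center.
    dx, dy, dt  : coefficients of the linear ("slanted") disparity model;
                  all zero recovers a rectangular, fronto-parallel window.
    Assumes the window and its shifted counterpart lie inside the images.
    """
    cost = 0.0
    for t in range(t0 - half_t, t0 + half_t + 1):
        for y in range(y0 - half_w, y0 + half_w + 1):
            for x in range(x0 - half_w, x0 + half_w + 1):
                # Disparity varies linearly across the window, so the
                # window is "slanted" along all three axes.
                d = d0 + dx * (x - x0) + dy * (y - y0) + dt * (t - t0)
                diff = left[t, y, x] - right[t, y, x - int(round(d))]
                cost += diff * diff
    return cost
```

Setting dx = dy = dt = 0 recovers the basic rectangular spatio-temporal window; nonzero coefficients implement the slanted windows described above.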
8.4 REGISTERING 3D DATASETS
3D data acquired using LiDAR or structured light from a single point of view suffers
from the shadowing problem illustrated in Figure 8.3. That is, we only get a depth
estimate at a given pixel for the corresponding scene surface closest to the camera.
Therefore, we commonly move the scanner around the scene to acquire scans from
viewpoints that fill in the gaps and make the 3D model more complete.
In this section, we address two key problems associated with this process. The first
is how to align multiple 3D datasets into the same coordinate system. We take a similar
approach to the problem of 2D image alignment: features in each scan are detected,
matched, and used as the basis for estimating a parametric transformation between
each scan pair. However, in 3D we need different methods for feature detection and
registration, as we discuss in Sections 8.4.1 and 8.4.2.
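For concreteness, once candidate feature matches between two scans are in hand, the least-squares rigid transformation aligning them can be computed in closed form using an SVD (the standard Kabsch/Umeyama construction). The sketch below is a generic version of this step under the assumption of a rigid (rotation plus translation) model; it is not necessarily the exact formulation used in Section 8.4.2, and the function name and array conventions are illustrative.

```python
import numpy as np

def rigid_transform_from_matches(P, Q):
    """Least-squares rigid transform (R, t) mapping points P onto Q.

    P, Q : (N, 3) arrays of matched 3D feature locations in two scans.
    Returns rotation R (3x3) and translation t (3,) minimizing
    sum_i || R @ P[i] + t - Q[i] ||^2.
    """
    cp, cq = P.mean(axis=0), Q.mean(axis=0)   # centroids of each point set
    H = (P - cp).T @ (Q - cq)                 # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection (det = -1) in the recovered rotation.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = cq - R @ cp
    return R, t
```

In practice, such a closed-form estimate from feature matches typically serves as the initialization for a finer registration algorithm, as discussed in the sections that follow.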
Once we have a method for aligning scans, the second problem is how to cre-
ate a usable triangular mesh from the resulting collection of points. Algorithms for
these problems of multiscan fusion and meshing are overviewed in Section 8.4.3.
Throughout this section, we motivate the algorithms using data acquired from LiDAR
scanners, but the same methods apply to point clouds created from structured light
or multi-view stereo.
8.4.1 Feature Detection and Matching
The goal of feature detection and matching in 3D is the same as in 2D: to find regions
of a scan that can be reliably, unambiguously matched with scans of the same scene
from different perspectives. These feature matches can subsequently be used to ini-
tialize or aid in registration, as described in the next section. However, the nature of 3D
data requires us to rethink the criteria for what makes a "good" feature. Figure 8.29a,