accelerated research in stereo and optical flow. The two main evaluation datasets
are based on roughly constant-reflectance models approximately ten centimeters on
a side, captured from hundreds of viewpoints distributed on a hemisphere (see
Figure 8.27a). Strecha et al. [470] later contributed a benchmarking dataset for large-
scale multi-view stereo algorithms, using high-resolution images of buildings many
meters on a side.
It's important to note that while modern multi-view stereo results are qualitatively
quite impressive, and quantitatively (i.e., sub-millimeter) accurate for small objects,
purely image-based techniques are not yet ready to replace LiDAR systems for highly
accurate, large-scale 3D data acquisition. For example, Strecha et al. [470] estimated
that for large outdoor scenes, only forty to sixty percent of the 3D points for a top MVS
algorithm applied to high-resolution images were within three standard deviations
of the noise level of a LiDAR scanner, while ten to thirty percent of the ground truth
measurements were missing or wildly inaccurate in the MVS result. For this reason,
multi-view stereo papers typically use LiDAR or structured light results as the ground
truth for their algorithm comparisons. Multi-view stereo algorithms can also be quite
computationally expensive and hence slow, another drawback compared to near-
real-time structured light systems.
8.3.1 Volumetric Methods
Volumetric methods for 3D data acquisition share similarities with the problem of
finding the visual hull from silhouettes of an object, as discussed in Section 7.7.3 .
As in the visual hull problem, we require a set of accurately calibrated cameras, and
represent 3D space as a set of occupied voxels. The finer the voxelization of the space
is, the more accurate the 3D reconstruction will be. However, in the multi-view stereo
problem, we also use the colors of the pixels inside the object silhouette — not only
to color the voxels on the resulting 3D surface but also to remove voxels inconsistent
with the observed color in some source image. Therefore, the result of a volumetric
multi-view stereo method is usually a subset of the visual hull with colors associated
to each surface voxel, so that rendering the voxels from the point of view of each
camera should produce a similar image to what was actually acquired.
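As a concrete illustration of this representation, the following minimal sketch keeps only those voxels whose centers project inside every silhouette, i.e., the visual-hull portion of the computation that the color-consistency test then refines. It assumes pinhole cameras given as 3x4 projection matrices and a nearest-pixel lookup; the names project and carve_visual_hull are illustrative, not taken from any particular system.

import numpy as np

def project(P, X):
    # Project a 3D point X (length-3 array) with a 3x4 camera matrix P.
    x = P @ np.append(X, 1.0)
    return x[0] / x[2], x[1] / x[2]

def carve_visual_hull(voxel_centers, cameras, silhouettes):
    # voxel_centers: (N, 3) array of voxel center coordinates.
    # cameras:       list of 3x4 projection matrices, one per image.
    # silhouettes:   list of boolean masks, True inside the object silhouette.
    keep = np.ones(len(voxel_centers), dtype=bool)
    for P, mask in zip(cameras, silhouettes):
        h, w = mask.shape
        for i, X in enumerate(voxel_centers):
            if not keep[i]:
                continue
            u, v = project(P, X)
            ui, vi = int(round(u)), int(round(v))
            # A voxel that projects outside the silhouette in any view
            # cannot belong to the object.
            if not (0 <= vi < h and 0 <= ui < w and mask[vi, ui]):
                keep[i] = False
    return voxel_centers[keep]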
The basic voxel coloring idea originated in a paper by Seitz and Dyer [ 435 ]. In this
approach, a plane of voxels is swept through space along the direction of increasing
distance from the cameras. A special camera configuration — for example, one in which
no scene point is contained within the convex hull of the camera centers — is required.
Each voxel in the current plane is projected to the images, and the colors of the
corresponding image pixels are evaluated for consistency. If the colors are all suf-
ficiently similar (e.g., the standard deviation of the set of color measurements is
sufficiently small), the voxel is called photo-consistent, kept, and colored; otherwise,
it is removed. 16 The photo-consistency idea is illustrated in Figure 8.24 . Voxels along
lines of sight “behind” (i.e., in a depth plane after) a colored voxel are not considered
in subsequent steps.
16 This method, and multi-view stereo methods in general, perform best on Lambertian surfaces, as
opposed to specular or translucent ones. Of course, the same is true for LiDAR and structured light
methods.
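The sweep and the photo-consistency test can be sketched as follows, again assuming 3x4 projection matrices, nearest-pixel color lookup, and a simple per-channel standard-deviation threshold; the names voxel_coloring, is_photo_consistent, and sigma_max are illustrative rather than taken from Seitz and Dyer's implementation.

import numpy as np

def is_photo_consistent(colors, sigma_max=10.0):
    # colors: (K, 3) array of RGB samples, one per unoccluded view of the voxel.
    # Keep the voxel if the per-channel standard deviation is sufficiently small.
    if len(colors) < 2:
        return True
    return bool(np.all(np.std(colors, axis=0) <= sigma_max))

def voxel_coloring(planes, cameras, images, sigma_max=10.0):
    # planes:  list of (M, 3) arrays of voxel centers, ordered by increasing
    #          distance from the cameras (the sweep direction).
    # cameras: list of 3x4 projection matrices; images: list of HxWx3 arrays.
    # Returns a dict mapping kept voxel centers (as tuples) to their mean color.
    occluded = [np.zeros(img.shape[:2], dtype=bool) for img in images]
    colored = {}
    for plane in planes:
        for X in plane:
            samples, hits = [], []
            for P, img, occ in zip(cameras, images, occluded):
                x = P @ np.append(X, 1.0)            # project the voxel center
                ui, vi = int(round(x[0] / x[2])), int(round(x[1] / x[2]))
                h, w = occ.shape
                if 0 <= vi < h and 0 <= ui < w and not occ[vi, ui]:
                    samples.append(img[vi, ui].astype(float))
                    hits.append((occ, vi, ui))
            if samples and is_photo_consistent(np.array(samples), sigma_max):
                colored[tuple(X)] = np.mean(samples, axis=0)
                # Pixels claimed by this voxel become occluded, so voxels
                # "behind" it in later planes are not considered.
                for occ, vi, ui in hits:
                    occ[vi, ui] = True
    return colored

Marking pixels as occluded once a voxel is kept is what implements the rule above that voxels behind a colored voxel are not considered in subsequent steps.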
 