Three-Dimensional Data Acquisition - Computer Vision for Visual Effects


triangulation. This mesh is then evolved and adapted using a regularized partial

differential equation.

Surface deformation methods generally have a bias toward computing surfaces

with minimal surface area or bending energy, which can have the effect of smoothing

away sharp details. On the other hand, the continuity of the mesh/level-set enables

the 3D reconstruction to span flat, untextured regions on the underlying surface that

are challenging for the methods we discuss next. Thus, surface deformation results

typically don't contain missing regions.

8.3.3 Patch-Based Methods

The multi-view stereo methods discussed so far are well suited to a single object in

an uncluttered scene (e.g., a small statue rotated on a turntable in front of a camera),

especially in situations where silhouettes can be extracted to estimate the visual hull.

However, such techniques don't scale well to reconstructing 3D environments (e.g.,

movie sets) that contain many disconnected surfaces at different depths, as well as

clutter like fences and trees. We now turn to patch-based methods, which impose

no assumptions about the structure of the scene and are much better suited for these

types of problems. The scene is modeled as a collection of small 3D planar patches

initially created by triangulating feature matches and then grown to cover surfaces of

the scene based on the evidence from surrounding pixels in the source images.

Here, we overview the patch-based multi-view stereo (PMVS) algorithm of Furukawa and Ponce [160], one of the best-known and top-performing multi-view stereo algorithms.¹⁹ A patch p is defined as a 3D rectangle with center c(p) and unit normal vector n(p), sized to project to a small (e.g., 5 × 5 pixel) window in a specified reference image. The goal is to generate a large number of such patches that cover the scene as well as possible and are consistent with the source images.

We begin by detecting DoG and Harris features in each source image, as discussed in Chapter 4. To ensure uniform coverage, a coarse grid is overlaid on each image and the four strongest features are selected in each grid square. For each feature in a given image, the corresponding epipolar lines in other images are searched for a high-quality match, in order of increasing distance from the camera that acquired the reference image. If a good match is found, a patch p is generated with initial center computed by triangulating the feature match, and initial normal given by the unit vector pointing toward the reference camera. The set of images in which p is visible, denoted V(p), is initialized as those images for which the angle between the patch normal and the viewing direction is sufficiently small.
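The grid-based feature selection and the initialization of a patch's normal and visible set can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the PMVS implementation: the function names, the (x, y, strength) feature tuples, the use of camera centers to define viewing directions, and the 60-degree angle threshold are all hypothetical choices made for this example. Triangulation of the patch center is assumed to have been done already.

```python
import math
from collections import defaultdict


def select_grid_features(features, cell_size, per_cell=4):
    """Keep the `per_cell` strongest features in each grid square.

    `features` is a list of (x, y, strength) tuples in pixel coordinates
    (a hypothetical format chosen for this sketch).
    """
    cells = defaultdict(list)
    for x, y, strength in features:
        cell = (int(x // cell_size), int(y // cell_size))
        cells[cell].append((x, y, strength))
    kept = []
    for members in cells.values():
        members.sort(key=lambda f: f[2], reverse=True)  # strongest first
        kept.extend(members[:per_cell])
    return kept


def unit(v):
    """Scale a 3-vector to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v)


def init_normal_and_visible_set(center, ref_index, camera_centers,
                                max_angle_deg=60.0):
    """Initialize n(p) and V(p) for a patch with triangulated center c(p).

    The normal is the unit vector from the patch center toward the
    reference camera. An image enters V(p) when the angle between the
    normal and the direction toward that image's camera is below
    `max_angle_deg` (an assumed, illustrative threshold).
    """
    def direction_to(cam):
        return unit(tuple(c - p for c, p in zip(cam, center)))

    normal = direction_to(camera_centers[ref_index])
    cos_limit = math.cos(math.radians(max_angle_deg))
    visible = [
        i for i, cam in enumerate(camera_centers)
        if sum(a * b for a, b in zip(normal, direction_to(cam))) >= cos_limit
    ]
    return normal, visible
```

For example, with a patch at the origin and cameras at (0, 0, 10), (10, 0, 10), and (0, 0, -10), choosing the first camera as the reference gives the normal (0, 0, 1); the second camera views the patch at 45 degrees and is kept, while the third faces the back of the patch and is excluded.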

We impose a regular grid on p, project it to all the images in V(p), and score each image based on the normalized cross-correlation between the intensities at its projected locations and those in the reference image, as illustrated in Figure 8.26a. Images that match poorly are then removed from V(p), and the patch is assigned an overall score based on the average normalized cross-correlation of the remaining samples. Finally, the center and normal of each patch are optimized by maximizing this overall score (equivalently, minimizing a discrepancy such as one minus the average normalized cross-correlation) with respect to these parameters.
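The scoring step can be illustrated with a short Python sketch. It assumes the grid on the patch has already been projected into each image and the intensities sampled, so each image contributes a flat list of values in the same grid order. The function names, the dictionary layout, and the 0.7 rejection threshold are assumptions made for this example and are not taken from the PMVS code.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation of two equal-length intensity lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [x - mean_b for x in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(x * x for x in db))
    if denom == 0.0:
        return 0.0  # assumption: a constant window is treated as uncorrelated
    return sum(x * y for x, y in zip(da, db)) / denom


def score_patch(samples, ref_index, min_ncc=0.7):
    """Filter V(p) and compute an overall patch score.

    `samples` maps each image index in V(p) to the intensities sampled at
    the projected grid points. Each non-reference image is compared with
    the reference image; images scoring below `min_ncc` (an assumed
    threshold) are dropped, and the overall score is the mean
    normalized cross-correlation of the images that remain.
    """
    ref = samples[ref_index]
    scores = {i: ncc(ref, s) for i, s in samples.items() if i != ref_index}
    kept = {i: s for i, s in scores.items() if s >= min_ncc}
    overall = sum(kept.values()) / len(kept) if kept else float("-inf")
    visible = sorted([ref_index] + list(kept))
    return visible, overall
```

A refinement stage would then adjust the patch center and normal, resample the intensities, and call `score_patch` again, keeping the parameters that give the highest overall score. Because normalized cross-correlation subtracts the mean and divides by the standard deviation, the score is unaffected by gain and offset changes in brightness between images.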


19 An implementation is publicly available at http://grail.cs.washington.edu/software/pmvs/.
