Matchmoving - Computer Vision for Visual Effects


[Figure 6.12 panels: (a) image frames; (b) keyframes]

Figure 6.12. (a) Sequential updating of cameras uses overlapping pairs of images to successively estimate projective camera matrices. (b) Hierarchical updating uses a subset of keyframes: images chosen to give wider baselines. Intermediate cameras and scene points can be estimated using resectioning and triangulation. In both cases, triples of images can be used instead, leveraging the trifocal tensor.

images in the sequence. Beardsley et al. [34] described this process in detail, taking into account the problem of maintaining good estimates of the 3D structure used for resectioning as the sequence gets longer and feature matches enter and leave the images.
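The resectioning step referred to here can be illustrated with a minimal sketch, assuming the standard direct linear transform (DLT) formulation rather than the specific procedure of Beardsley et al. The function name `resection_dlt` is hypothetical, and the sketch omits data normalization and outlier rejection, both of which a practical system would need.

```python
import numpy as np

def resection_dlt(X, x):
    """Estimate a 3x4 camera matrix P (up to scale) from n >= 6 correspondences.

    X: (n, 4) array of homogeneous 3D points.
    x: (n, 3) array of homogeneous image points.
    Assumes noise-free or well-conditioned data (no normalization is applied).
    """
    rows = []
    zero = np.zeros(4)
    for Xi, (u, v, w) in zip(X, x):
        # Two independent rows of the constraint x_i cross (P X_i) = 0.
        rows.append(np.concatenate([zero, -w * Xi, v * Xi]))
        rows.append(np.concatenate([w * Xi, zero, -u * Xi]))
    A = np.asarray(rows)
    # The entries of P form the right singular vector of A with the
    # smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)
```

Given 3D points that have already been reconstructed and their matches in a new image, this returns the new camera matrix in the same projective frame, which is what allows the sequence to be extended one image at a time.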

Avidan and Shashua [22] and Fitzgibbon and Zisserman [145] described methods that “thread together” triples of images to estimate the next camera matrix instead of using pairs of images, so that all the cameras are represented in a common projective frame. These methods are based on the trifocal tensor, a 3 × 3 × 3 array that relates feature correspondences in image triples similarly to how the fundamental matrix relates feature correspondences in image pairs. Methods based on triples of images are often preferred since the trifocal constraint is stronger, making it easier to reject outlier feature matches; also, each triple overlaps the previous one by two images, adding robustness to the solution. Fitzgibbon and Zisserman also described how to enforce a constraint if an image sequence is known to be closed, that is, when the first and last camera matrices are the same.
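To make the trifocal constraint concrete, the following is a minimal sketch, assuming the first camera is in the canonical form [I | 0] and using the standard textbook construction of the tensor from the other two camera matrices. It is not the estimation procedure of the cited papers, which recover the tensor from feature matches; the function names are hypothetical.

```python
import numpy as np

def skew(v):
    """3x3 skew-symmetric matrix such that skew(v) @ w equals cross(v, w)."""
    x, y, z = v
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])

def trifocal_from_cameras(P2, P3):
    """Build the 3x3x3 trifocal tensor for cameras [I | 0], P2, P3.

    Slice i is T_i = a_i b4^T - a4 b_i^T, where a_i and b_i are the columns
    of P2 and P3 respectively.
    """
    A, a4 = P2[:, :3], P2[:, 3]
    B, b4 = P3[:, :3], P3[:, 3]
    return np.stack([np.outer(A[:, i], b4) - np.outer(a4, B[:, i])
                     for i in range(3)])

def trifocal_residual(T, x1, x2, x3):
    """Point-point-point incidence: [x2]_x (sum_i x1[i] T_i) [x3]_x.

    The result is (numerically) the 3x3 zero matrix for a true correspondence.
    """
    M = np.einsum('i,ijk->jk', x1, T)
    return skew(x2) @ M @ skew(x3)
```

A candidate match across three images can be scored by the magnitude of `trifocal_residual`; a correspondence that is wrong in any one of the three views generally produces a nonzero residual, which is the sense in which the three-view constraint is stronger than the pairwise one.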

When successive images are very close together spatially (which is not unusual), the decomposition in Equation (6.31) is unstable; that is, the fundamental matrix may be poorly estimated since t is so small in Equation (6.29). In these cases, a global projective transformation may better express the relationship between the two views (since the motion is nearly pure rotation). Torr et al. [495] discussed this problem of degeneracy in calibrating image sequences, and proposed methods for “surviving” these situations when they are encountered in practice. The key idea is to incorporate a robust model selection criterion at each frame that decides whether the relationship between an image pair is better modeled by a fundamental matrix or a projective transformation. The same problem occurs when the scene in an image consists primarily of a single, dominant plane; Pollefeys et al. [369] extended Torr et al.'s approach to operate on image triples in this situation.
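The text does not give the form of the model selection criterion, so the following is only a sketch of the idea, assuming the commonly cited GRIC (geometric robust information criterion) form with its usual constants. The function names, the noise level `sigma`, and the choice of constants are assumptions; the squared residuals are taken as already computed for each fitted model.

```python
import numpy as np

def gric(sq_residuals, sigma, d, k, r=4, lam3=2.0):
    """GRIC score (lower is better) for one fitted two-view model.

    sq_residuals: squared residual of each correspondence under the model.
    sigma: assumed standard deviation of the measurement noise.
    d: dimension of the model manifold (3 for F, 2 for a homography).
    k: number of model parameters (7 for F, 8 for a homography).
    r: dimension of the data (4 for a pair of 2D points).
    """
    e2 = np.asarray(sq_residuals, dtype=float)
    n = e2.size
    # Robust data term: each residual is capped so outliers cannot dominate.
    rho = np.minimum(e2 / sigma**2, lam3 * (r - d))
    # Penalty terms for the structure (d * n) and for the model parameters (k).
    return rho.sum() + np.log(r) * d * n + np.log(r * n) * k

def select_two_view_model(sq_res_F, sq_res_H, sigma=1.0):
    """Return 'F' or 'H' together with both scores."""
    score_F = gric(sq_res_F, sigma, d=3, k=7)
    score_H = gric(sq_res_H, sigma, d=2, k=8)
    return ("F" if score_F < score_H else "H"), score_F, score_H
```

When both models fit the matches equally well, as in near-pure rotation or a dominant plane, the lower-dimensional projective transformation receives the smaller penalty and is selected; when the matches fit only the fundamental matrix, its lower data term outweighs the penalty.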

Alternatively, we can take keyframes from the sequence that are spatially far enough apart to enable robust estimation of F, but not so far apart that feature matching

