[Figure 6.12 panels: (a) image frames; (b) image frames and keyframes]
Figure 6.12. (a) Sequential updating of cameras uses overlapping pairs of images to
successively estimate projective camera matrices. (b) Hierarchical updating uses a subset of
keyframes: images chosen to give wider baselines. Intermediate cameras and scene points can be
estimated using resectioning and triangulation. In both cases, triples of images can be used
instead, leveraging the trifocal tensor.
images in the sequence. Beardsley et al. [34] described this process in detail, taking
into account the problem of maintaining good estimates of the 3D structure used
for resectioning as the sequence gets longer and feature matches enter and leave the
images.
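
The resectioning step mentioned above can be sketched with a basic direct linear transformation (DLT). This is a minimal illustration under stated assumptions, not the procedure of Beardsley et al.: the function names `resection_dlt` and `project` are hypothetical, the data are synthetic and noise-free, and the coordinate normalization and outlier rejection that a practical system would need are omitted.

```python
import numpy as np

def resection_dlt(X, x):
    """Estimate a 3x4 camera matrix P from n >= 6 point correspondences.

    X: (n, 3) array of 3D points, x: (n, 2) array of image points.
    Each correspondence contributes two linear equations in the 12
    entries of P; the solution is the right singular vector of the
    stacked system with the smallest singular value (P is up to scale).
    """
    X = np.asarray(X, dtype=float)
    x = np.asarray(x, dtype=float)
    n = X.shape[0]
    Xh = np.hstack([X, np.ones((n, 1))])  # homogeneous 3D points
    A = np.zeros((2 * n, 12))
    for i in range(n):
        u, v = x[i]
        A[2 * i, 0:4] = Xh[i]             # p1.X - u * p3.X = 0
        A[2 * i, 8:12] = -u * Xh[i]
        A[2 * i + 1, 4:8] = Xh[i]         # p2.X - v * p3.X = 0
        A[2 * i + 1, 8:12] = -v * Xh[i]
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 4)

def project(P, X):
    """Project (n, 3) points with camera P and return (n, 2) image points."""
    Xh = np.hstack([X, np.ones((X.shape[0], 1))])
    xh = Xh @ P.T
    return xh[:, :2] / xh[:, 2:3]

# Synthetic check: project known points with a known camera, then recover it.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(20, 3)) + np.array([0.0, 0.0, 5.0])
P_true = np.array([[800.0, 0.0, 320.0, 10.0],
                   [0.0, 800.0, 240.0, -5.0],
                   [0.0, 0.0, 1.0, 0.5]])
x = project(P_true, X)
P_est = resection_dlt(X, x)
max_err = np.abs(project(P_est, X) - x).max()  # reprojection error in pixels
```

The recovered matrix agrees with the true one only up to an overall scale, so the check compares reprojected points rather than matrix entries.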
Avidan and Shashua [22] and Fitzgibbon and Zisserman [145] described methods
that “thread together” triples of images to estimate the next camera matrix instead of
using pairs of images, so that all the cameras are represented in a common projective
frame. These methods are based on the trifocal tensor, a 3 × 3 × 3 array that relates
feature correspondences in image triples similarly to how the fundamental matrix
relates feature correspondences in image pairs. Methods based on triples of images
are often preferred since the trifocal constraint is stronger, making it easier to reject
outlier feature matches; also, each triple overlaps the previous one by two images,
adding robustness to the solution. Fitzgibbon and Zisserman also described how to
enforce a constraint if an image sequence is known to be closed — that is, the first
and last camera matrices are the same.
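
To make the trifocal constraint concrete, the sketch below builds the tensor from three camera matrices and evaluates the point-point-point incidence relation. It assumes the standard construction from the multiple-view geometry literature, in which the first camera is [I | 0] and each slice is T_i = a_i b_4^T - a_4 b_i^T, where a_i and b_i are the columns of the second and third cameras. The notation and the function names (`skew`, `trifocal_from_cameras`, `point_residual`) are assumptions for illustration and do not come from the text above.

```python
import numpy as np

def skew(v):
    """Return the 3x3 matrix [v]_x such that [v]_x @ w equals cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def trifocal_from_cameras(P2, P3):
    """Build the 3x3x3 trifocal tensor, assuming the first camera is [I | 0].

    Slice i is T[i] = a_i b_4^T - a_4 b_i^T, with a_i and b_i the columns
    of the second and third camera matrices.
    """
    T = np.zeros((3, 3, 3))
    a4, b4 = P2[:, 3], P3[:, 3]
    for i in range(3):
        T[i] = np.outer(P2[:, i], b4) - np.outer(a4, P3[:, i])
    return T

def point_residual(T, x1, x2, x3):
    """Evaluate [x2]_x (sum_i x1[i] T[i]) [x3]_x for homogeneous points.

    The result is the 3x3 zero matrix when the three image points are
    projections of the same 3D point.
    """
    M = np.tensordot(x1, T, axes=1)  # contracts x1 with the first tensor index
    return skew(x2) @ M @ skew(x3)

# Second and third cameras translate along x and y, respectively.
P2 = np.array([[1.0, 0.0, 0.0, 1.0],
               [0.0, 1.0, 0.0, 0.0],
               [0.0, 0.0, 1.0, 0.0]])
P3 = np.array([[1.0, 0.0, 0.0, 0.0],
               [0.0, 1.0, 0.0, 1.0],
               [0.0, 0.0, 1.0, 0.0]])
T = trifocal_from_cameras(P2, P3)

X = np.array([1.0, 2.0, 5.0, 1.0])     # homogeneous 3D point
x1, x2, x3 = X[:3], P2 @ X, P3 @ X     # projections in the three views
good = point_residual(T, x1, x2, x3)   # consistent triple
bad = point_residual(T, x1, x2, x3 + np.array([1.0, 0.0, 0.0]))  # wrong match
```

Because a mismatched point in any one view generally produces a nonzero residual, a test of this kind is one way a triple-based method can flag outlier matches.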
When successive images are very close together spatially (which is not unusual),
the decomposition in Equation (6.31) is unstable; that is, the fundamental matrix
may be poorly estimated since t is so small in Equation (6.29). In these cases, a
global projective transformation (a homography) may better express the relationship between the
two views (since the motion is nearly pure rotation). Torr et al. [495] discussed this
problem of degeneracy in calibrating image sequences, and proposed methods for
“surviving” these situations when they are encountered in practice. The key idea is
to incorporate a robust model selection criterion at each frame that decides whether
the relationship between an image pair is better modeled by a fundamental matrix or
a projective transformation. The same problem occurs when the scene in an image is
composed primarily of a single dominant plane; Pollefeys et al. [369] extended Torr
et al.'s approach to operate on image triples in this situation.
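
The per-frame model selection idea can be illustrated with a score that trades goodness of fit against model complexity. The text above does not name the criterion, so the sketch below assumes one commonly quoted form of Torr's Geometric Robust Information Criterion (GRIC); the constants, the function names `gric` and `select_model`, and the residual values are assumptions for illustration. Squared residuals are clipped so that outliers cannot dominate the score, and the model with the lower score is preferred.

```python
import numpy as np

def gric(sq_residuals, sigma, d, k, r=4):
    """Assumed form of GRIC: sum(rho) + ln(r)*d*n + ln(r*n)*k.

    sq_residuals: squared residuals e_i^2 of the n matches under the model
    sigma: assumed standard deviation of the measurement noise
    d: dimension of the model manifold (3 for F, 2 for a homography)
    k: number of model parameters (7 for F, 8 for a homography)
    r: dimension of one observation (4 for a two-view match)
    """
    e2 = np.asarray(sq_residuals, dtype=float)
    n = e2.size
    lam1, lam2, lam3 = np.log(r), np.log(r * n), 2.0
    # Robust cost: each normalized residual is capped at lam3 * (r - d).
    rho = np.minimum(e2 / sigma**2, lam3 * (r - d))
    return rho.sum() + lam1 * d * n + lam2 * k

def select_model(sq_res_F, sq_res_H, sigma=1.0):
    """Return 'F' or 'H' (whichever scores lower) plus both scores."""
    g_F = gric(sq_res_F, sigma, d=3, k=7)
    g_H = gric(sq_res_H, sigma, d=2, k=8)
    return ("H" if g_H < g_F else "F"), g_F, g_H

n = 100
# Near-degenerate pair: both models fit the matches equally well.
choice_degenerate, _, _ = select_model(np.full(n, 0.5), np.full(n, 0.5))
# General motion: the homography leaves large residuals.
choice_general, _, _ = select_model(np.full(n, 0.5), np.full(n, 100.0))
```

When both models explain the matches equally well, the homography wins because of its lower manifold dimension; when it fits poorly, its clipped residuals outweigh that advantage and the fundamental matrix is selected.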
Alternatively, we can take keyframes from the sequence that are spatially far enough
apart to enable robust estimation of F, but not so far apart that feature matching