Graphics Reference
In-Depth Information
While we assumed throughout most of this chapter that brightness constancy roughly holds, obtaining dense correspondence between images of the same scene taken with different exposures is the basis of high dynamic-range imaging (HDRI), used in the visual effects industry to obtain high-quality lighting information about an environment [386]. However, HDRI images are typically acquired from the same camera position, so there is usually little need to compensate for camera motion.
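To make the merging step concrete, the Python sketch below assembles aligned exposures into a radiance map by a weighted average of per-exposure radiance estimates. This is a minimal sketch of standard HDR assembly under simplifying assumptions (pixel values normalized to [0, 1], camera response already inverted), not the specific pipeline of [386]; the function name and weighting scheme are choices of this sketch.

```python
import numpy as np

def merge_exposures(images, exposure_times):
    """Merge aligned, linearized exposures into an HDR radiance map.

    A sketch under simplifying assumptions: pixel values lie in [0, 1]
    and the camera response function has already been inverted.
    """
    num = np.zeros_like(images[0], dtype=np.float64)
    den = np.zeros_like(num)
    for z, dt in zip(images, exposure_times):
        z = z.astype(np.float64)
        # Trust mid-range pixels most; near-black and near-white pixels
        # are dominated by noise or clipping.
        w = 1.0 - np.abs(2.0 * z - 1.0)
        num += w * (z / dt)   # per-exposure estimate of scene radiance
        den += w
    return num / np.maximum(den, 1e-8)
```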
Mutual information [543] is a powerful tool for image registration when the brightness constancy assumption is violated but correspondences between images can still be defined; this approach is quite common for aligning medical imagery from different modalities.
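As an illustration, mutual information between two aligned images can be estimated from their joint intensity histogram, and registration then searches for the warp parameters that maximize it. The Python sketch below shows one standard estimator; the bin count and other details are choices of this sketch, not of [543].

```python
import numpy as np

def mutual_information(a, b, bins=64):
    """Estimate the mutual information between two aligned images
    from their joint intensity histogram."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p_ab = hist / hist.sum()      # joint distribution of intensities
    p_a = p_ab.sum(axis=1)        # marginal for image a
    p_b = p_ab.sum(axis=0)        # marginal for image b
    nz = p_ab > 0                 # avoid log(0) on empty bins
    return np.sum(p_ab[nz] * np.log(p_ab[nz] / np.outer(p_a, p_b)[nz]))
```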
As mentioned in Section 5.8, one way to view image-based rendering problems is as interpolating a continuous function from its samples. Formally, each source image provides samples of the plenoptic function that defines the scene intensity visible to an observer at every possible location, orientation, time, and light wavelength [2].
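In its most general form, the plenoptic function is a seven-dimensional function of viewing position, viewing direction, wavelength, and time,

\[
P = P(x, y, z, \theta, \phi, \lambda, t),
\]

where $(x, y, z)$ is the observer's position, $(\theta, \phi)$ the viewing direction, $\lambda$ the light wavelength, and $t$ the time; practical systems fix or discretize several of these dimensions.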
To synthesize a new view, we need different samples of the plenoptic function that must be estimated from the given information. For example, instead of the configurations required for view morphing, McMillan and Bishop [322] and Shum and He [446] discussed view synthesis schemes in which the source and synthesized views came from cylindrical panoramas of the scene taken from different locations.
When multiple cameras are available, image-based view synthesis techniques can be applied to generate a wider range of virtual viewpoints (i.e., the virtual camera center can be located anywhere within the convex hull of the source camera centers). For example, Zitnick et al. [581] used an arc of eight cameras for high-quality view synthesis. The lumigraph [175] and light field [274] are approaches that use hundreds or even thousands of closely spaced images of an object for view synthesis. We will return to multicamera methods for 3D acquisition and view synthesis in Section 8.3.
A less flashy but important type of view synthesis is video stabilization, the automatic removal of annoying shaky motion from handheld video. For example, Matsushita et al. [315] proposed a video stabilization algorithm that begins with an optical flow field between each pair of adjacent images (defined as an affine transformation plus a local motion field estimated with Lucas-Kanade). A stabilized sequence is created by smoothing the chain of affine transformations to remove high-frequency motions. The resulting images are deblurred, and holes in the synthesized views are filled by inpainting the local motion field.
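The smoothing step is the heart of the approach. The Python sketch below illustrates the idea on a chain of 3×3 homogeneous affine matrices: accumulate the inter-frame transforms into a camera path, smooth the path with a Gaussian temporal window, and warp each frame from its shaky pose to the smoothed one. Averaging matrix entries is a simplification adopted for this sketch (it behaves reasonably for small motions), not the exact formulation of [315], and the function name is hypothetical.

```python
import numpy as np

def stabilizing_warps(pairwise, sigma=2.0, radius=6):
    """Compute one corrective warp per frame from a chain of
    inter-frame affine transforms.

    pairwise[t] is a 3x3 homogeneous affine matrix mapping frame t
    coordinates into frame t+1 coordinates.
    """
    n = len(pairwise) + 1
    # chain[t] maps frame 0 coordinates into frame t coordinates.
    chain = [np.eye(3)]
    for A in pairwise:
        chain.append(A @ chain[-1])

    offsets = np.arange(-radius, radius + 1)
    weights = np.exp(-offsets**2 / (2.0 * sigma**2))

    warps = []
    for t in range(n):
        # Gaussian-weighted average of nearby poses (clamped at the
        # sequence ends); entrywise averaging is a simplification.
        acc, total = np.zeros((3, 3)), 0.0
        for o, w in zip(offsets, weights):
            s = min(max(t + o, 0), n - 1)
            acc += w * chain[s]
            total += w
        smoothed = acc / total
        # Map frame t back to frame 0, then forward to its smoothed pose.
        warps.append(smoothed @ np.linalg.inv(chain[t]))
    return warps
```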
Gleicher and Liu [169] refined this approach with a technique they called re-cinematography, which tries to replace the camera motion with one that follows cinematic conventions, such as a slow, steady pan. In general, video stabilization techniques are improved by reconstructing the camera motion in 3D (e.g., [293, 294]) instead of operating directly on 2D images. The next chapter discusses this matchmoving problem.
Shechtman et al. [440] proposed a generalization of morphing that gradually transforms a source image into a target image, even when the images are so different that a natural correspondence between them is impossible to define (e.g., between a face and an image of clouds). The underlying concept is the bidirectional similarity measure from Section 3.5.3, which encourages each patch in an intermediate image to resemble a patch from either the source or the target, depending on the fraction of the way along the morph.
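For reference, a bidirectional similarity between images $S$ and $T$ sums a completeness term (every patch of $S$ should appear somewhere in $T$) and a coherence term (every patch of $T$ should come from somewhere in $S$),

\[
d(S, T) = \frac{1}{N_S} \sum_{P \subset S} \min_{Q \subset T} D(P, Q)
        + \frac{1}{N_T} \sum_{Q \subset T} \min_{P \subset S} D(Q, P),
\]

where $P$ and $Q$ range over patches, $N_S$ and $N_T$ count them, and $D$ is a patch distance such as the sum of squared differences. This is the generic form; the exact weighting used in Section 3.5.3 and in [440] may differ. In the morphing setting, an intermediate image a fraction $\alpha$ of the way along the morph draws its patches from the target with weight roughly $\alpha$ and from the source with weight roughly $1 - \alpha$.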