rotation matrix, and t, a 3 × 1 translation vector. From this we calculate the
camera's position as −R^T t. Furthermore, we can incorporate the optical image
distortion caused by imperfect lenses into (1) by writing λ r(x) = MX, with r(·)
the distortion function. This function maps a correct (undistorted) image to
its distorted version, e.g. r(x) = 1 + k1‖x‖² + k2‖x‖⁴. Many more distortion
functions exist [2], as well as camera calibration techniques to determine the
parameters of the distortion function in advance [5,3].
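To make the projection and distortion step concrete, the following is a minimal NumPy sketch of projecting a 3D point with a camera K[R | t] and applying a two-parameter radial distortion of the form used above. The function names, the placement of the distortion on normalized coordinates, and the use of ‖x‖ as the radial term are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def project_point(K, R, t, X, k1=0.0, k2=0.0):
    """Project a 3D point X with camera M = K [R | t], then apply a
    simple radial distortion r(x) = 1 + k1*||x||^2 + k2*||x||^4
    (assumed model; names are illustrative)."""
    x_cam = R @ X + t                       # point in camera coordinates
    x = x_cam[:2] / x_cam[2]                # normalized image coordinates
    r2 = np.dot(x, x)                       # squared radial distance ||x||^2
    x_dist = x * (1.0 + k1 * r2 + k2 * r2 * r2)   # distorted coordinates
    u = K @ np.array([x_dist[0], x_dist[1], 1.0]) # apply intrinsics
    return u[:2] / u[2]

def camera_center(R, t):
    """Camera position from R and t: C = -R^T t."""
    return -R.T @ t
```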
The 3D reconstruction from multiple views now comes down to finding val-
ues for all cameras M and all 3D points X so that the difference between the
computed position of x from (1) and the measured position of x is minimized,
    min_{M_j, X_i}  ∑_{i=1}^{n} ∑_{j=1}^{m} d(M_j X_i, x_ij)²        (3)
with d(x, y) the Euclidean distance. The sum of these squared distances is called
the reprojection error, and (3) is typically minimized using bundle adjustment
[15]. Due to the sparse nature of the equations (the parameters of individual 3D
points X and cameras M do not interact) several optimizations for speed can
be applied. Work on this by Lourakis et al. resulted in the open source software
package sba [7], using a modified version of the Levenberg-Marquardt algorithm
for the iterative optimization. Still, the required time for optimization is at least
quadratic in the number of 3D points, although exact timings depend on the
scene under consideration [7]. When all matrices M are known, a dense recon-
struction of the scene can be created, using for example the methods described
in [4].
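As a reference point for what bundle adjustment minimizes, the sketch below evaluates the reprojection error of (3) for a given set of cameras, 3D points, and observations. The data layout (projection matrices and an observation list) is an assumption for illustration; a real bundle adjuster such as sba [7] would expose per-observation residuals and the sparse Jacobian to a Levenberg-Marquardt solver rather than just the scalar sum.

```python
import numpy as np

def reprojection_error(cameras, points3d, observations):
    """Sum of squared distances d(M_j X_i, x_ij)^2 from (3).
    cameras:      list of 3x4 projection matrices M_j
    points3d:     (n, 3) array of 3D points X_i
    observations: list of (i, j, x_ij) tuples with measured 2D points x_ij
    (hedged sketch; data layout is assumed, not the paper's)."""
    total = 0.0
    for i, j, x_meas in observations:
        X_h = np.append(points3d[i], 1.0)   # homogeneous 3D point
        proj = cameras[j] @ X_h              # lambda * x = M_j X_i
        x_proj = proj[:2] / proj[2]          # back to inhomogeneous 2D
        total += np.sum((x_proj - np.asarray(x_meas)) ** 2)
    return total
```

The sparsity mentioned above is visible here: each residual depends on only one camera M_j and one point X_i, which is exactly the structure that sba exploits to speed up the optimization.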
2.2 Finding Corresponding Points
The bundle adjustment requires knowledge of the points x_ij. In other words, we must
identify points in all images that correspond to the same physical location or 3D
point. Several algorithms exist that do this, most notably SIFT [8] and SURF
[1]. These methods extract feature points that are likely to be recognized in an-
other image, and then match those feature points based on the Euclidean distances
between their associated feature descriptors. The number of point matches that
are generated, as well as their reliability, depends on the input images and on
several parameters of the algorithms. In general, regions without distinctive fea-
tures will contain fewer feature points. While this makes the detected features more
robust, on a large scene it can also give rise to regions without any feature points.
This is not desirable, so we tweak the parameters of the feature point detection
algorithm to give roughly the same number of feature points for all input images
(e.g. 500 points). If too many features are found, the parameters are tightened,
and vice versa.
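A possible way to realize this tuning is sketched below with OpenCV's SIFT implementation: the detector's contrast threshold is adjusted until roughly the target number of keypoints is found, and descriptors are then matched by Euclidean distance. The target of 500 points echoes the figure mentioned above, but the adjustment rule, the bounds, and the use of Lowe's ratio test are assumptions made for illustration rather than the paper's exact procedure.

```python
import cv2

TARGET = 500  # illustrative target count, cf. the ~500 points mentioned above

def detect_about_n_features(gray_img, target=TARGET, tries=8):
    """Adjust the SIFT contrast threshold until roughly `target` keypoints
    are detected (sketch; the adjustment rule is an assumption)."""
    thresh = 0.04  # OpenCV's default contrastThreshold
    kps, desc = [], None
    for _ in range(tries):
        sift = cv2.SIFT_create(contrastThreshold=thresh)
        kps, desc = sift.detectAndCompute(gray_img, None)
        if abs(len(kps) - target) < 0.1 * target:
            break
        # too many features -> tighten (raise) the threshold, too few -> loosen it
        thresh *= 1.5 if len(kps) > target else 0.66
    return kps, desc

def match_features(desc1, desc2, ratio=0.75):
    """Match descriptors by Euclidean distance, keeping matches that
    pass Lowe's ratio test (a common choice, assumed here)."""
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    pairs = matcher.knnMatch(desc1, desc2, k=2)
    good = []
    for pair in pairs:
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    return good
```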
2.3 Avoiding Local Optima
Due to its nonlinearity, the reprojection error defined by (3) contains many
local optima. This problem gets worse when some of the point matches found
 