high percentage of overlap, making this possible. However, the computation time
required by structure from motion grows approximately quadratically with the
number of points in the scene [15], and so it does not scale well to very large scenes.
If we could limit the structure from motion optimization to small scenes,
and then later combine all small scenes into one global scene, the downsides
of the quadratic behaviour would be largely avoided. Additionally, every small
scene can be optimized in parallel, further increasing the reconstruction speed.
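As a rough illustration of this split-and-merge idea, the sketch below aligns two independently optimized partial reconstructions by estimating a similarity transform (scale, rotation, translation) from the 3D points they share, and then merges them into one coordinate frame. This is our own minimal Python sketch, assuming that neighbouring partial scenes overlap in some points; the Umeyama-style alignment and the function names are illustrative and not necessarily the exact procedure developed in this paper.

import numpy as np

def similarity_from_correspondences(src, dst):
    # Least-squares scale/rotation/translation mapping src onto dst
    # (Umeyama-style alignment); src and dst are (N, 3) arrays of
    # corresponding 3D points from the two partial reconstructions.
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_s, dst - mu_d
    U, S, Vt = np.linalg.svd(dst_c.T @ src_c / len(src))
    D = np.eye(3)
    if np.linalg.det(U @ Vt) < 0:      # guard against reflections
        D[2, 2] = -1.0
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) / src_c.var(axis=0).sum()
    t = mu_d - s * (R @ mu_s)
    return s, R, t

def merge_partial_scenes(points_a, points_b, shared_a, shared_b):
    # Map reconstruction B into the coordinate frame of A using the
    # indices of their shared 3D points, then concatenate the clouds
    # (deduplication of the shared points is omitted for brevity).
    s, R, t = similarity_from_correspondences(points_b[shared_b],
                                              points_a[shared_a])
    points_b_in_a = s * (R @ points_b.T).T + t
    return np.vstack([points_a, points_b_in_a])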
In this paper we work out the details involved in splitting and recombining
a large scene, and evaluate how the final result changes with respect to the
original, slow reconstruction. We do not compare with any ground truth, as the
absolute accuracy of the 3D model obtained by various reconstruction methods
has already been evaluated in [11].
Previous work on speeding up structure from motion for large scenes includes
sub-sampling and hierarchical decomposition [10]. While effective, the downside
here is that for very large scenes, the required time is still quadratic. The methods
presented in [14] and [9] also use partial reconstructions to speed up the final
result. However, these techniques use the Hessian of the reprojection error and
its eigenvector to split up the global scene, which implies that the scene must
already be approximately reconstructed. Again, this requires a lot of time for large
scenes consisting of many pictures.
The rest of this paper is arranged as follows. First we briefly explain structure
from motion and bundle adjustment. Next we present the theory behind our
method to speed up the computations, followed by experiments to evaluate the
accuracy on practical data. We end with a conclusion.
2 Structure from Motion
2.1 3D Reconstruction with Bundle Adjustment
We first discuss the problem of the 3D reconstruction of a scene, based on
multiple 2D views from different locations. Given n 3D points which are observed
by m cameras, we can write as x_ij the projection of point i in camera j, with
i = 1..n and j = 1..m. The projection from a 3D point X to a 2D point x can
be written compactly in homogeneous coordinates as
λ x = M X    (1)
with λ a scale factor and M the 3 x 4 homogeneous camera matrix, with 11
independent parameters [6]. This projective camera model can be simplified to
Euclidean cameras, for which M can be decomposed into
M = K [R | t]    (2)
where t is appended to R as a fourth column. Here K is the 3 x 3 intrinsic
calibration matrix containing the camera's optical properties, such as the focal
length, principal point and aspect ratio. The camera's extrinsic parameters are
given by R, a 3 x 3 rotation matrix, and by the translation vector t.
 