high percentage of overlap, making this possible. However, the computation time
required by structure from motion grows approximately quadratically with the
number of points in the scene [15], and so it does not scale well to very large scenes.
If we could limit the structure from motion optimization to small scenes,
and then later combine all small scenes into one global scene, the downsides
of the quadratic behaviour would be largely avoided. Additionally, every small
scene can be optimized in parallel, further increasing the reconstruction speed.
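As a rough illustration of this split-and-merge idea, the sketch below aligns two independently optimized partial reconstructions by estimating a similarity transform (scale, rotation, translation) from the 3D points they share, and then merges them into one coordinate frame. This is our own minimal Python sketch, assuming that neighbouring partial scenes overlap in some points; the Umeyama-style alignment and the function names are illustrative and not necessarily the exact procedure developed in this paper.

import numpy as np

def similarity_from_correspondences(src, dst):
    # Least-squares scale/rotation/translation mapping src onto dst
    # (Umeyama-style alignment); src and dst are (N, 3) arrays of
    # corresponding 3D points from the two partial reconstructions.
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_s, dst - mu_d
    U, S, Vt = np.linalg.svd(dst_c.T @ src_c / len(src))
    D = np.eye(3)
    if np.linalg.det(U @ Vt) < 0:      # guard against reflections
        D[2, 2] = -1.0
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) / src_c.var(axis=0).sum()
    t = mu_d - s * (R @ mu_s)
    return s, R, t

def merge_partial_scenes(points_a, points_b, shared_a, shared_b):
    # Map reconstruction B into the coordinate frame of A using the
    # indices of their shared 3D points, then concatenate the clouds
    # (deduplication of the shared points is omitted for brevity).
    s, R, t = similarity_from_correspondences(points_b[shared_b],
                                              points_a[shared_a])
    points_b_in_a = s * (R @ points_b.T).T + t
    return np.vstack([points_a, points_b_in_a])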
In this paper we work out the details involved in splitting and recombining
a large scene, and evaluate how the final result changes with respect to the
original, slow reconstruction. We do not compare with any ground truth, as the
absolute accuracy of the 3D model obtained by various reconstruction methods
has already been evaluated in [11].
Previous work on speeding up structure from motion for large scenes includes
sub-sampling and hierarchical decomposition [10]. While effective, the downside
here is that for very large scenes, the required time is still quadratic. The methods
presented in [14] and [9] also use partial reconstructions to speed up the final
result. However, these techniques use the Hessian of the reprojection error and
its eigenvector to split up the global scene, which implies that the scene must
already be approximately reconstructed. Again, this requires a lot of time for large
scenes consisting of many pictures.
The rest of this paper is arranged as follows. First we briefly explain structure
from motion and bundle adjustment. Next we present the theory behind our
method to speed up the computations, followed by experiments to evaluate the
accuracy on practical data. We end with a conclusion.
2 Structure from Motion
2.1 3D Reconstruction with Bundle Adjustment
We first discuss the problem of the 3D reconstruction of a scene, based on
multiple 2D views from different locations. Given n 3D points which are observed
by m cameras, we can write as x_ij the projection of point i in camera j, with
i = 1..n and j = 1..m. The projection from a 3D point X to a 2D point x can
be written compactly in homogeneous coordinates as
λ x = M X    (1)
with λ a scale factor and M the 3 x 4 homogeneous camera matrix, with 11
independent parameters [6]. This projective camera model can be simplified to
Euclidean cameras, for which M can be decomposed into
M = K [R | t]    (2)
where t is appended to R as a fourth column. Here K is the 3 x 3 intrinsic
calibration matrix containing the camera's optical properties, such as the focal
length, principal point and aspect ratio. The camera's extrinsic parameters are
given by R, a 3 x 3 rotation matrix, and by the translation vector t.
 