the object contour projected into the image, applying the iterative closest point (ICP) algorithm by Zhang (1999b), which is especially designed for analysing free-form surfaces (cf. also Sect. 2.3).
Rosenhahn et al. (2006) compare the ICP algorithm for three-dimensional pose estimation in stereo image pairs with a level set approach formulated in terms of a computational technique from the field of optical flow analysis. The pose estimation is based on silhouettes. A quantitative evaluation of the two methods and their combination demonstrates that the highest performance is achieved by combining both approaches, especially regarding the convergence radius, i.e. the ability to converge towards the true pose from a considerably different initial pose.
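The generic ICP loop underlying the methods above can be sketched as follows. This is a minimal illustration of the basic idea (alternate nearest-neighbour correspondence with a least-squares rigid alignment), not Zhang's free-form-surface variant or the silhouette-based formulation of Rosenhahn et al.; the point sets, iteration counts, and tolerances are illustrative assumptions.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst
    (Kabsch/SVD solution for matched point pairs)."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    # Reflection guard: force det(R) = +1 so R is a proper rotation.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = c_dst - R @ c_src
    return R, t

def icp(source, target, n_iter=50, tol=1e-8):
    """Iteratively align the 'source' point set to the 'target' point set."""
    src = source.copy()
    prev_err = np.inf
    err = prev_err
    for _ in range(n_iter):
        # Correspondence step: nearest target point for each source point.
        d2 = ((src[:, None, :] - target[None, :, :]) ** 2).sum(axis=2)
        nn = d2.argmin(axis=1)
        err = np.sqrt(d2[np.arange(len(src)), nn]).mean()
        if abs(prev_err - err) < tol:
            break
        prev_err = err
        # Alignment step: best rigid transform onto the matched points.
        R, t = best_rigid_transform(src, target[nn])
        src = src @ R.T + t
    return src, err
```

The convergence-radius issue discussed above shows up directly in this sketch: if the initial pose is far from the true one, the nearest-neighbour correspondences are largely wrong and the loop can settle in a local minimum.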
The method of von Bank et al. (2003), which is described in detail in Sect. 2.1.2, is extended by Krüger (2007) to a multiocular setting characterised by three calibrated cameras. The accuracy of the three-dimensional pose estimation results obtained by Krüger (2007) is discussed in Sect. 6.1.
The object recognition and pose estimation system proposed by Collet et al. (2011) for the manipulation of objects by a robot relies on one or several calibrated images of the scene. Three-dimensional models of the objects are constructed using a structure from motion approach, where a model is associated with a set of features. The three-dimensional scene reconstruction and the estimation of the pose parameters are performed simultaneously based on the 'iterative clustering estimation' algorithm, where features detected in the images are clustered and associated with objects and their corresponding pose parameters in an iterative manner by employing robust optimisation methods. Collet et al. (2011) use the mean-shift algorithm as proposed by Cheng (1995) for clustering. The resulting object hypotheses are clustered again, using the 'projection clustering' approach, and a pose refinement is applied, which yields the objects in the scene with their associated pose parameters. An optimised hardware architecture using graphics processing units for the computationally complex parts of the algorithm, such as the feature detection step, results in cycle times of about two seconds for a real-world image sequence with 60 objects per image. The pose estimation accuracy of the system is discussed in Sect. 6.1.
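The mean-shift clustering step referenced above can be illustrated with a minimal sketch in the spirit of Cheng (1995): every point is shifted repeatedly to the kernel-weighted mean of the data around it, and points whose shifts converge to the same density mode form one cluster. This is not the implementation of Collet et al. (2011); the Gaussian weighting, bandwidth, and mode-merging threshold are illustrative assumptions.

```python
import numpy as np

def mean_shift(points, bandwidth=1.0, n_iter=100, tol=1e-5):
    """Cluster 'points' by shifting each one to the Gaussian-weighted mean
    of the data until convergence; converged modes are merged into clusters."""
    points = np.asarray(points, dtype=float)
    modes = points.copy()
    for _ in range(n_iter):
        # Squared distances from every current mode to every data point.
        d2 = ((modes[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
        w = np.exp(-d2 / (2.0 * bandwidth ** 2))
        new_modes = (w[:, :, None] * points[None, :, :]).sum(axis=1) \
            / w.sum(axis=1, keepdims=True)
        converged = np.abs(new_modes - modes).max() < tol
        modes = new_modes
        if converged:
            break
    # Merge modes that ended up close together into one cluster each.
    labels = -np.ones(len(points), dtype=int)
    centers = []
    for i, m in enumerate(modes):
        for k, c in enumerate(centers):
            if np.linalg.norm(m - c) < bandwidth / 2:
                labels[i] = k
                break
        else:
            centers.append(m)
            labels[i] = len(centers) - 1
    return np.array(centers), labels
```

Unlike k-means, this procedure does not require the number of clusters in advance; the bandwidth alone determines how many modes emerge, which is convenient when the number of object hypotheses in a scene is unknown.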
2.1.2 Template-Based Pose Estimation
Many industrial applications of pose estimation methods for quality inspection purposes impose severe constraints on the hardware to be used with respect to robustness and easy maintenance. Hence, it is often not possible to utilise multiocular camera systems, since they have to be recalibrated regularly, especially when the sensor unit is mounted on an industrial robot. As a consequence, employing a monocular camera system may be favourable from a practical point of view, while a high pose estimation accuracy is nevertheless required to detect subtle deviations between the true and the desired object pose.
The presentation in this section is adopted from von Bank et al. (2003). The appearance-based 2D-3D pose estimation method described here involves