the object contour projected into the image, applying the iterative closest point (ICP) algorithm by Zhang (1999b), which is especially designed for analysing free-form surfaces (cf. also Sect. 2.3).
Rosenhahn et al. (2006) compare the ICP algorithm for three-dimensional pose estimation in stereo image pairs with a level set approach formulated in terms of a computational technique from the field of optical flow analysis. The pose estimation is based on silhouettes. A quantitative evaluation of the two methods and their combination demonstrates that the highest performance is achieved by combining both approaches, especially regarding the convergence radius, i.e. the ability to converge towards the true pose from a considerably different initial pose.
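The generic ICP loop underlying the methods above can be sketched as follows. This is a minimal illustration of the basic idea (alternate nearest-neighbour correspondence with a least-squares rigid alignment), not Zhang's free-form-surface variant or the silhouette-based formulation of Rosenhahn et al.; the point sets, iteration counts, and tolerances are illustrative assumptions.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst
    (Kabsch/SVD solution for matched point pairs)."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    # Reflection guard: force det(R) = +1 so R is a proper rotation.
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    t = c_dst - R @ c_src
    return R, t

def icp(source, target, n_iter=50, tol=1e-8):
    """Iteratively align the 'source' point set to the 'target' point set."""
    src = source.copy()
    prev_err = np.inf
    err = prev_err
    for _ in range(n_iter):
        # Correspondence step: nearest target point for each source point.
        d2 = ((src[:, None, :] - target[None, :, :]) ** 2).sum(axis=2)
        nn = d2.argmin(axis=1)
        err = np.sqrt(d2[np.arange(len(src)), nn]).mean()
        if abs(prev_err - err) < tol:
            break
        prev_err = err
        # Alignment step: best rigid transform onto the matched points.
        R, t = best_rigid_transform(src, target[nn])
        src = src @ R.T + t
    return src, err
```

The convergence-radius issue discussed above shows up directly in this sketch: if the initial pose is far from the true one, the nearest-neighbour correspondences are largely wrong and the loop can settle in a local minimum.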
The method of von Bank et al. (2003), which is described in detail in Sect. 2.1.2, is extended by Krüger (2007) to a multiocular setting characterised by three calibrated cameras. The accuracy of the three-dimensional pose estimation results obtained by Krüger (2007) is discussed in Sect. 6.1.
The object recognition and pose estimation system proposed by Collet et al. (2011) for the manipulation of objects by a robot relies on one or several calibrated images of the scene. Three-dimensional models of the objects are constructed using a structure from motion approach, where a model is associated with a set of features. The three-dimensional scene reconstruction and the estimation of the pose parameters are performed simultaneously based on the 'iterative clustering estimation' algorithm, where features detected in the images are clustered and associated with objects and their corresponding pose parameters in an iterative manner by employing robust optimisation methods. Collet et al. (2011) use the mean-shift algorithm as proposed by Cheng (1995) for clustering. The resulting object hypotheses are clustered again, using the 'projection clustering' approach, and a pose refinement is applied, which yields the objects in the scene with their associated pose parameters. An optimised hardware architecture using graphics processing units for the computationally complex parts of the algorithm, such as the feature detection step, results in cycle times of about two seconds for a real-world image sequence with 60 objects per image. The pose estimation accuracy of the system is discussed in Sect. 6.1.
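The mean-shift clustering step referenced above can be illustrated with a minimal sketch in the spirit of Cheng (1995): every point is shifted repeatedly to the kernel-weighted mean of the data around it, and points whose shifts converge to the same density mode form one cluster. This is not the implementation of Collet et al. (2011); the Gaussian weighting, bandwidth, and mode-merging threshold are illustrative assumptions.

```python
import numpy as np

def mean_shift(points, bandwidth=1.0, n_iter=100, tol=1e-5):
    """Cluster 'points' by shifting each one to the Gaussian-weighted mean
    of the data until convergence; converged modes are merged into clusters."""
    points = np.asarray(points, dtype=float)
    modes = points.copy()
    for _ in range(n_iter):
        # Squared distances from every current mode to every data point.
        d2 = ((modes[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
        w = np.exp(-d2 / (2.0 * bandwidth ** 2))
        new_modes = (w[:, :, None] * points[None, :, :]).sum(axis=1) \
            / w.sum(axis=1, keepdims=True)
        converged = np.abs(new_modes - modes).max() < tol
        modes = new_modes
        if converged:
            break
    # Merge modes that ended up close together into one cluster each.
    labels = -np.ones(len(points), dtype=int)
    centers = []
    for i, m in enumerate(modes):
        for k, c in enumerate(centers):
            if np.linalg.norm(m - c) < bandwidth / 2:
                labels[i] = k
                break
        else:
            centers.append(m)
            labels[i] = len(centers) - 1
    return np.array(centers), labels
```

Unlike k-means, this procedure does not require the number of clusters in advance; the bandwidth alone determines how many modes emerge, which is convenient when the number of object hypotheses in a scene is unknown.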
2.1.2 Template-Based Pose Estimation
Many industrial applications of pose estimation methods for quality inspection purposes impose severe constraints on the hardware to be used with respect to robustness and easy maintenance. Hence, it is often not possible to utilise multiocular camera systems, since they have to be recalibrated regularly, especially when the sensor unit is mounted on an industrial robot. As a consequence, employing a monocular camera system may be favourable from a practical point of view, while a high pose estimation accuracy is nevertheless required to detect subtle deviations between the true and the desired object pose.
The presentation in this section is adopted from von Bank et al. (2003). The appearance-based 2D-3D pose estimation method described here involves