Three-Dimensional Pose Estimation and Segmentation Methods - 3D Computer Vision: Efficient Methods and Applications


A further important classical approach to 2D-3D pose estimation, introduced by Lamdan and Wolfson (1988), is geometric hashing. An object is represented in terms of its geometric features, where coordinate systems are constructed based on specific features, especially points. For example, for a set of points lying in a plane, an orthogonal coordinate system is given by any pair of points on the plane, and the other points can be expressed in that coordinate system. All point pair-specific sets of coordinates ('models') along with the points that define them ('basis pair') are stored in what is called the 'hash table'. Lamdan and Wolfson (1988) assume non-planar objects to be composed of planar parts. The pose estimation process is then performed based on a random selection of pairs of observed scene points, where all other scene points are again expressed in the correspondingly defined coordinate system. For each selected pair of scene points, the coordinates of the remaining scene points are looked up in the hash table, and every matching entry increments a counter for the corresponding model and basis pair. All entries of the hash table for which the counter exceeds a previously defined minimum value are regarded as occurrences of the modelled object in the observed scene, where the parameters of the corresponding coordinate system denote the pose parameters.
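The voting scheme can be illustrated with a minimal sketch for a single planar model. It is not the implementation of Lamdan and Wolfson (1988): the function names, the cell size of the quantisation, and the use of complex numbers to obtain coordinates that are invariant under rotation, scaling, and translation are assumptions made for illustration, and the scene pair is fixed here rather than drawn at random.

```python
import math
from collections import defaultdict
from itertools import permutations

def basis_coords(p, b0, b1):
    """Coordinates of p in the frame with origin b0 and unit x-axis b1 - b0."""
    z = (complex(*p) - complex(*b0)) / (complex(*b1) - complex(*b0))
    return z.real, z.imag

def quantise(coords, cell):
    """Map continuous coordinates to a discrete hash-table key."""
    return round(coords[0] / cell), round(coords[1] / cell)

def build_table(model, cell=0.25):
    """Store, for every ordered basis pair, the coordinates of all other points."""
    table = defaultdict(list)
    for i, j in permutations(range(len(model)), 2):
        for k, p in enumerate(model):
            if k not in (i, j):
                key = quantise(basis_coords(p, model[i], model[j]), cell)
                table[key].append((i, j))
    return table

def vote(table, scene, s0, s1, cell=0.25):
    """Express the scene in the frame of the pair (s0, s1) and count matches."""
    votes = defaultdict(int)
    for k, p in enumerate(scene):
        if k not in (s0, s1):
            key = quantise(basis_coords(p, scene[s0], scene[s1]), cell)
            for basis in table.get(key, ()):
                votes[basis] += 1
    return votes

def similarity(p, angle, scale, shift):
    """Rotate, scale, and translate a 2D point (used to synthesise a scene)."""
    z = complex(*p) * scale * complex(math.cos(angle), math.sin(angle))
    z += complex(*shift)
    return z.real, z.imag

# Hypothetical planar model and a transformed copy of it acting as the scene.
model = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 3.0), (1.0, 1.5)]
scene = [similarity(p, math.radians(30.0), 1.7, (5.0, -2.0)) for p in model]

table = build_table(model)
votes = vote(table, scene, 0, 1)  # scene pair (0, 1) chosen by hand
# The model basis (0, 1) receives one vote per remaining scene point.
print(votes[(0, 1)], "of", len(scene) - 2, "possible votes")
```

In a complete system the scene pair would be sampled repeatedly, and a basis whose counter exceeds the chosen threshold would yield the pose via the transformation that maps the model basis onto the scene pair.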

An important edge-based approach to 2D-3D pose estimation is proposed by Lowe (1991). The object model is assumed to be composed of plane parts of polygonal shape as an approximation to the true surface shape. The contours of the object in the image are obtained by projecting the silhouette of the object into the image plane. Edge segments are extracted from the utilised greyscale image with the Canny edge detector (Canny, 1986). The error function is given by the sum of the perpendicular distances in the image plane between the extracted edge segments and the nearest projected object contour line. The pose parameters, including internal degrees of freedom of the object, are obtained by least-mean-squares minimisation of the error function with the Gauß-Newton method or alternatively with the Levenberg-Marquardt algorithm (Press et al., 2007), where the latter generally displays a more robust convergence behaviour.
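The minimisation of perpendicular distances can be sketched for a strongly simplified setting. The example below is not Lowe's method: it assumes a 2D rigid pose (rotation angle and translation) instead of a projected 3D model, known correspondences between model points and image lines, and synthetic noise-free data. All names are hypothetical. The damping term added to the normal equations is the Levenberg variant; for a damping value tending to zero the step approaches the Gauß-Newton step.

```python
import math

def residuals_and_jacobian(pose, model_pts, lines):
    """Signed point-to-line distances and derivatives w.r.t. (theta, tx, ty).

    Each line is (nx, ny, d) with unit normal (nx, ny) and n . x = d.
    """
    theta, tx, ty = pose
    c, s = math.cos(theta), math.sin(theta)
    res, jac = [], []
    for (mx, my), (nx, ny, d) in zip(model_pts, lines):
        x = c * mx - s * my + tx
        y = s * mx + c * my + ty
        res.append(nx * x + ny * y - d)
        dx_dtheta = -s * mx - c * my
        dy_dtheta = c * mx - s * my
        jac.append((nx * dx_dtheta + ny * dy_dtheta, nx, ny))
    return res, jac

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda i: abs(m[i][col]))
        m[col], m[piv] = m[piv], m[col]
        for i in range(col + 1, 3):
            f = m[i][col] / m[col][col]
            for k in range(col, 4):
                m[i][k] -= f * m[col][k]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, 3))
        x[i] = (m[i][3] - tail) / m[i][i]
    return x

def fit_pose(model_pts, lines, start=(0.0, 0.0, 0.0), lam=1e-3, iterations=50):
    """Damped least-squares iteration: accept a step only if the cost drops."""
    pose = list(start)
    res, jac = residuals_and_jacobian(pose, model_pts, lines)
    cost = sum(r * r for r in res)
    for _ in range(iterations):
        a = [[sum(j[p] * j[q] for j in jac) + (lam if p == q else 0.0)
              for q in range(3)] for p in range(3)]
        g = [-sum(j[p] * r for j, r in zip(jac, res)) for p in range(3)]
        step = solve3(a, g)
        trial = [v + dv for v, dv in zip(pose, step)]
        t_res, t_jac = residuals_and_jacobian(trial, model_pts, lines)
        t_cost = sum(r * r for r in t_res)
        if t_cost < cost:   # success: move and reduce the damping
            pose, res, jac, cost = trial, t_res, t_jac, t_cost
            lam *= 0.1
        else:               # failure: keep the pose and increase the damping
            lam *= 10.0
    return pose

# Synthetic data: lines passing through the model points at a known pose.
true_pose = (0.2, 0.5, -0.3)
model_pts = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
normals = [(1.0, 0.0), (0.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
c0, s0 = math.cos(true_pose[0]), math.sin(true_pose[0])
lines = []
for (mx, my), (nx, ny) in zip(model_pts, normals):
    x = c0 * mx - s0 * my + true_pose[1]
    y = s0 * mx + c0 * my + true_pose[2]
    lines.append((nx, ny, nx * x + ny * y))

pose = fit_pose(model_pts, lines)
print(pose)
```

Rejecting steps that increase the cost and raising the damping in that case is what makes the damped iteration more tolerant of poor starting values than the undamped Gauß-Newton step.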

A similar 2D-3D pose estimation approach based on point and line correspondences is described by Phong et al. (1996). They propose a quadratic error function by expressing the pose parameters in terms of a quaternion. Minimisation of the error function to determine the pose parameters is performed using a trust-region method, which yields a superior performance when compared with the Gauß-Newton method.
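One reason why a quaternion parametrisation leads to polynomial expressions is that every element of the rotation matrix is a homogeneous quadratic function of the components of a unit quaternion, so that no trigonometric functions of the pose parameters occur. The following sketch shows this standard relation only; it does not reproduce the specific error function of Phong et al. (1996), and the function names are chosen for illustration.

```python
import math

def quat_to_matrix(q):
    """Rotation matrix of a unit quaternion q = (w, x, y, z).

    Every entry is a quadratic polynomial in w, x, y, z.
    """
    w, x, y, z = q
    return [
        [w * w + x * x - y * y - z * z, 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), w * w - x * x + y * y - z * z, 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), w * w - x * x - y * y + z * z],
    ]

def axis_angle_to_quat(axis, angle):
    """Unit quaternion for a rotation by `angle` (radians) about `axis`."""
    norm = math.sqrt(sum(a * a for a in axis))
    s = math.sin(angle / 2.0) / norm
    return (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)

def rotate(q, p):
    """Apply the rotation described by the unit quaternion q to the point p."""
    r = quat_to_matrix(q)
    return tuple(sum(r[i][j] * p[j] for j in range(3)) for i in range(3))

# A quarter turn about the z-axis maps the x-axis onto the y-axis.
q = axis_angle_to_quat((0.0, 0.0, 1.0), math.pi / 2.0)
rotated = rotate(q, (1.0, 0.0, 0.0))
print(rotated)
```

The unit-norm condition on the quaternion has to be enforced separately, for instance as a constraint or by renormalisation after each update.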

In the framework developed by Grebner (1994), 2D-3D pose estimation of industrial parts is performed based on edges and corner points, where the parameter search, corresponding to the minimisation of an appropriately chosen cost function, is performed using the A* algorithm (cf. e.g. Sagerer, 1985, for an overview).

2.1.1.2 Appearance-Based Pose Estimation Methods

A different class of methods consists of the appearance-based approaches, which

directly compare the observed image with the appearance of the object at different
