depth does not change. The resulting three-dimensional pose is computed with the MOCCD algorithm, where the pose of the best matching rectangle is used for initialisation.
The second hypothesis is computed using a two-dimensional correlation-based
pose refinement algorithm. The last valid three-dimensional pose vector T is used
to project a rectangle circumscribing the forearm into the image of camera 1. This
rectangle is the starting position of a two-dimensional greedy optimisation which searches for the centre and rotation of the rectangle yielding the highest normalised cross-correlation with the reference template of the last valid time step. A three-dimensional pose is inferred from the best matching rectangle and used as an initialisation for the MOCCD algorithm.
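The two-dimensional greedy refinement step can be sketched as follows. The helper names (`ncc`, `extract_patch`, `greedy_refine`), the nearest-neighbour image sampling, the step sizes, and the stopping rule are illustrative assumptions, not the published implementation:

```python
import numpy as np

def ncc(patch, template):
    """Normalised cross-correlation between two equally sized patches."""
    a = patch - patch.mean()
    b = template - template.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom > 0 else 0.0

def extract_patch(image, cx, cy, angle, w, h):
    """Sample a w x h rectangle centred at (cx, cy), rotated by angle (rad)."""
    ys, xs = np.mgrid[-(h // 2):h // 2, -(w // 2):w // 2]
    c, s = np.cos(angle), np.sin(angle)
    px = np.clip((cx + c * xs - s * ys).astype(int), 0, image.shape[1] - 1)
    py = np.clip((cy + s * xs + c * ys).astype(int), 0, image.shape[0] - 1)
    return image[py, px]

def greedy_refine(image, template, cx, cy, angle, w, h, n_iter=50):
    """Greedy hill climbing over the rectangle's centre and rotation."""
    best = ncc(extract_patch(image, cx, cy, angle, w, h), template)
    steps = [(1, 0, 0.0), (-1, 0, 0.0), (0, 1, 0.0), (0, -1, 0.0),
             (0, 0, 0.02), (0, 0, -0.02)]
    for _ in range(n_iter):
        improved = False
        for dx, dy, da in steps:
            score = ncc(extract_patch(image, cx + dx, cy + dy,
                                      angle + da, w, h), template)
            if score > best:
                best, cx, cy, angle = score, cx + dx, cy + dy, angle + da
                improved = True
        if not improved:  # local maximum of the correlation surface
            break
    return cx, cy, angle, best
```

Starting from the projected rectangle of the last valid pose, the routine climbs the correlation surface until no single-step move improves the match score.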
Based on the criteria of the verification module, the better of the two hypotheses is selected. If this hypothesis passes the verification module, tracking is continued using the corresponding three-dimensional pose.
2.3 Point Cloud Segmentation Approaches
For the point-based three-dimensional pose estimation methods outlined in Sect. 2.1, explicit knowledge about correspondences between three-dimensional model points and two-dimensional image points is required. The problem of estimating the pose of an object is then equivalent to that of determining the exterior camera orientation (cf. Sect. 1.4). In contrast, appearance-based pose estimation approaches like those described in Sects. 2.1 and 2.2.1.2 do not rely on explicit correspondences but, instead, minimise the difference between the expected appearance of the object according to the estimated pose and the true object appearance.
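The appearance-based principle, minimising the discrepancy between the predicted and the observed appearance over the pose parameters, can be illustrated with a toy example. The Gaussian-blob "renderer", the two-parameter pose, and the coordinate descent scheme are illustrative assumptions, not one of the cited methods:

```python
import numpy as np

def expected_appearance(pose, shape=(32, 32)):
    """Toy 'renderer': a Gaussian blob whose position follows the pose.
    Stands in for projecting an object model into the image."""
    tx, ty = pose
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    return np.exp(-((xs - tx) ** 2 + (ys - ty) ** 2) / (2.0 * 3.0 ** 2))

def appearance_error(pose, observed):
    """Sum of squared differences between predicted and observed appearance."""
    return ((expected_appearance(pose, observed.shape) - observed) ** 2).sum()

def fit_pose(observed, init, step=1.0, n_iter=100):
    """Coordinate-wise greedy descent on the appearance error."""
    pose = np.asarray(init, dtype=float)
    err = appearance_error(pose, observed)
    for _ in range(n_iter):
        improved = False
        for i in range(len(pose)):
            for sign in (1.0, -1.0):
                trial = pose.copy()
                trial[i] += sign * step
                e = appearance_error(trial, observed)
                if e < err:
                    pose, err, improved = trial, e, True
        if not improved:  # no coordinate step reduces the error any further
            break
    return pose, err
```

No point correspondences appear anywhere: the pose is recovered purely by comparing a predicted image against the observed one, which is the essential difference from the point-based methods above.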
In many scenarios, a three-dimensional description of the scene is given as a point cloud obtained e.g. by stereo image analysis (cf. Sect. 1.2) or with active sensors such as laser scanning devices. Initially, this point cloud contains no information about objects in the scene. In such cases, an important task is the segmentation of the point cloud into objects, either without using a priori information or based on (weak or strong) model assumptions about the objects found in the scene. A scene segmentation without a priori knowledge can be achieved by clustering methods (Press et al., 2007; Marsland, 2009), while an important approach to model-based segmentation of point clouds is the iterative closest point (ICP) algorithm (Besl and McKay, 1992; Zhang, 1992). Similar methods have been developed in the domain of photogrammetry, e.g. to extract human-made objects such as buildings from topographic maps or terrestrial laser scanner data (Rottensteiner et al., 2005, 2006). We examine in detail a method introduced by Schmidt et al. (2007) for the detection and tracking of objects in a three-dimensional point cloud with motion attributes generated with the spacetime stereo approach described in Sect. 1.5.2.5. This method involves a clustering step relying on the spatial distribution and motion behaviour of the scene points, a subsequent model-fitting stage, and a kernel particle filter for tracking the detected objects.
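As a rough illustration of such a clustering step (not Schmidt et al.'s actual algorithm), scene points can be grouped by region growing in a joint position/velocity feature space, so that nearby points moving coherently fall into one cluster while co-located points with different motion are separated. The function name, the radius threshold, and the velocity weighting are assumptions made for this sketch:

```python
import numpy as np

def segment_point_cloud(points, velocities, radius=0.5, v_weight=1.0):
    """Greedy region growing in a combined position/motion feature space.

    points     : (n, 3) array of scene point coordinates
    velocities : (n, 3) array of per-point motion attributes
    Two points are linked if their weighted (position, velocity) feature
    vectors lie within `radius` of each other; connected groups form clusters.
    """
    feats = np.hstack([points, v_weight * velocities])
    n = len(feats)
    labels = -np.ones(n, dtype=int)   # -1 marks an unassigned point
    current = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        stack = [seed]                # grow a new cluster from this seed
        labels[seed] = current
        while stack:
            i = stack.pop()
            d = np.linalg.norm(feats - feats[i], axis=1)
            for j in np.where((d < radius) & (labels == -1))[0]:
                labels[j] = current
                stack.append(j)
        current += 1
    return labels
```

Because the velocity components enter the feature distance, two spatially overlapping point sets with opposing motion are assigned different labels, which is the behaviour a purely spatial clustering could not provide.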