depth does not change. The resulting three-dimensional pose is computed with the MOCCD algorithm, where the pose of the best matching rectangle is used for initialisation.
The second hypothesis is computed using a two-dimensional correlation-based
pose refinement algorithm. The last valid three-dimensional pose vector T is used
to project a rectangle circumscribing the forearm into the image of camera 1. This
rectangle is the starting position of a two-dimensional greedy optimisation which searches for the centre and rotation of the rectangle yielding the highest normalised cross-correlation with the reference template of the last valid time step. A three-dimensional pose is inferred from the best matching rectangle and used as an initialisation for the MOCCD algorithm.
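The two-dimensional greedy refinement step can be sketched as follows. The helper names (`ncc`, `extract_patch`, `greedy_refine`), the nearest-neighbour image sampling, the step sizes, and the stopping rule are illustrative assumptions, not the published implementation:

```python
import numpy as np

def ncc(patch, template):
    """Normalised cross-correlation between two equally sized patches."""
    a = patch - patch.mean()
    b = template - template.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom > 0 else 0.0

def extract_patch(image, cx, cy, angle, w, h):
    """Sample a w x h rectangle centred at (cx, cy), rotated by angle (rad)."""
    ys, xs = np.mgrid[-(h // 2):h // 2, -(w // 2):w // 2]
    c, s = np.cos(angle), np.sin(angle)
    px = np.clip((cx + c * xs - s * ys).astype(int), 0, image.shape[1] - 1)
    py = np.clip((cy + s * xs + c * ys).astype(int), 0, image.shape[0] - 1)
    return image[py, px]

def greedy_refine(image, template, cx, cy, angle, w, h, n_iter=50):
    """Greedy hill climbing over the rectangle's centre and rotation."""
    best = ncc(extract_patch(image, cx, cy, angle, w, h), template)
    steps = [(1, 0, 0.0), (-1, 0, 0.0), (0, 1, 0.0), (0, -1, 0.0),
             (0, 0, 0.02), (0, 0, -0.02)]
    for _ in range(n_iter):
        improved = False
        for dx, dy, da in steps:
            score = ncc(extract_patch(image, cx + dx, cy + dy,
                                      angle + da, w, h), template)
            if score > best:
                best, cx, cy, angle = score, cx + dx, cy + dy, angle + da
                improved = True
        if not improved:  # local maximum of the correlation surface
            break
    return cx, cy, angle, best
```

Starting from the projected rectangle of the last valid pose, the routine climbs the correlation surface until no single-step move improves the match score.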
Based on the criteria of the verification module, the better of the two hypotheses is selected. If this hypothesis passes the verification module, tracking is continued using the corresponding three-dimensional pose.
2.3 Point Cloud Segmentation Approaches
For the point-based three-dimensional pose estimation methods outlined in Sect. 2.1, explicit knowledge about correspondences between three-dimensional model points and two-dimensional image points is required. The problem of estimating the pose of an object is then equivalent to that of determining the exterior camera orientation (cf. Sect. 1.4). In contrast, appearance-based pose estimation approaches like those described in Sects. 2.1 and 2.2.1.2 do not rely on explicit correspondences but, instead, minimise the difference between the expected appearance of the object according to the estimated pose and the true object appearance.
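The appearance-based principle, minimising the discrepancy between the predicted and the observed appearance over the pose parameters, can be illustrated with a toy example. The Gaussian-blob "renderer", the two-parameter pose, and the coordinate descent scheme are illustrative assumptions, not one of the cited methods:

```python
import numpy as np

def expected_appearance(pose, shape=(32, 32)):
    """Toy 'renderer': a Gaussian blob whose position follows the pose.
    Stands in for projecting an object model into the image."""
    tx, ty = pose
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    return np.exp(-((xs - tx) ** 2 + (ys - ty) ** 2) / (2.0 * 3.0 ** 2))

def appearance_error(pose, observed):
    """Sum of squared differences between predicted and observed appearance."""
    return ((expected_appearance(pose, observed.shape) - observed) ** 2).sum()

def fit_pose(observed, init, step=1.0, n_iter=100):
    """Coordinate-wise greedy descent on the appearance error."""
    pose = np.asarray(init, dtype=float)
    err = appearance_error(pose, observed)
    for _ in range(n_iter):
        improved = False
        for i in range(len(pose)):
            for sign in (1.0, -1.0):
                trial = pose.copy()
                trial[i] += sign * step
                e = appearance_error(trial, observed)
                if e < err:
                    pose, err, improved = trial, e, True
        if not improved:  # no coordinate step reduces the error any further
            break
    return pose, err
```

No point correspondences appear anywhere: the pose is recovered purely by comparing a predicted image against the observed one, which is the essential difference from the point-based methods above.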
In many scenarios, a three-dimensional description of the scene is given as a point cloud obtained e.g. by stereo image analysis (cf. Sect. 1.2) or with active sensors such as laser scanning devices. Initially, this point cloud contains no information about objects in the scene. In such cases, an important task is the segmentation of the point cloud into objects, either without using a priori information or based on (weak or strong) model assumptions about the objects found in the scene. A scene segmentation without a priori knowledge can be achieved by clustering methods (Press et al., 2007; Marsland, 2009), while an important approach to model-based segmentation of point clouds is the iterative closest point (ICP) algorithm (Besl and McKay, 1992; Zhang, 1992). Similar methods have been developed in the domain of photogrammetry, e.g. to extract human-made objects such as buildings from topographic maps or terrestrial laser scanner data (Rottensteiner et al., 2005, 2006). We examine in detail a method introduced by Schmidt et al. (2007) for the detection and tracking of objects in a three-dimensional point cloud with motion attributes generated with the spacetime stereo approach described in Sect. 1.5.2.5. This method involves a clustering step relying on the spatial distribution and motion behaviour of the scene points, a subsequent model-fitting stage, and a kernel particle filter for tracking the detected objects.
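As a rough illustration of such a clustering step (not Schmidt et al.'s actual algorithm), scene points can be grouped by region growing in a joint position/velocity feature space, so that nearby points moving coherently fall into one cluster while co-located points with different motion are separated. The function name, the radius threshold, and the velocity weighting are assumptions made for this sketch:

```python
import numpy as np

def segment_point_cloud(points, velocities, radius=0.5, v_weight=1.0):
    """Greedy region growing in a combined position/motion feature space.

    points     : (n, 3) array of scene point coordinates
    velocities : (n, 3) array of per-point motion attributes
    Two points are linked if their weighted (position, velocity) feature
    vectors lie within `radius` of each other; connected groups form clusters.
    """
    feats = np.hstack([points, v_weight * velocities])
    n = len(feats)
    labels = -np.ones(n, dtype=int)   # -1 marks an unassigned point
    current = 0
    for seed in range(n):
        if labels[seed] != -1:
            continue
        stack = [seed]                # grow a new cluster from this seed
        labels[seed] = current
        while stack:
            i = stack.pop()
            d = np.linalg.norm(feats - feats[i], axis=1)
            for j in np.where((d < radius) & (labels == -1))[0]:
                labels[j] = current
                stack.append(j)
        current += 1
    return labels
```

Because the velocity components enter the feature distance, two spatially overlapping point sets with opposing motion are assigned different labels, which is the behaviour a purely spatial clustering could not provide.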