ters and observed image features by an accumulator structure in which a counter is incremented for each pose consistent with an observed feature, such that, after regarding all observed features, local maxima in the accumulator correspond to pose hypotheses). The third evaluated method is the gradient sign table approach introduced by Krüger (2007) (cf. Sect. 2.1.1). The three algorithms are compared using the trinocular Digiclops camera system. The pixel resolution of the regarded images is comparable to that of the images examined earlier in this section. For the examined methods, Krüger (2007) obtains mean accuracy values of about 1°–2° for the rotation angles and 3–7 mm for the depth. In contrast, the monocular technique by von Bank et al. (2003) examined above does not estimate the depth but assumes it to be known. No values are given by Krüger (2007) for the translational accuracy in the directions parallel to the image plane. Taken as a whole, the accuracies of the estimated pose angles are comparable for the monocular template matching approach that estimates five pose parameters and the extended, trinocular template matching technique that determines all six pose parameters.
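The accumulator-based voting scheme underlying such pose estimation can be illustrated by a short sketch. The following Python fragment is only an illustration under assumed interfaces; the pose discretisation and the feature-to-pose mapping `consistent_poses` are hypothetical placeholders and do not reproduce the concrete implementation evaluated by Krüger (2007).

```python
# Minimal sketch of accumulator-based pose voting (illustrative only).
import numpy as np

def vote_poses(features, consistent_poses, accumulator_shape):
    """Accumulate votes over a discretised pose space.

    features          : iterable of observed image features
    consistent_poses  : assumed callable mapping a feature to the index
                        tuples of all discretised poses consistent with it
                        (e.g. obtained from a precomputed model table)
    accumulator_shape : shape of the discretised pose space
    """
    accumulator = np.zeros(accumulator_shape, dtype=np.int32)
    for feature in features:
        for pose_idx in consistent_poses(feature):
            accumulator[pose_idx] += 1   # one vote per consistent pose
    # Local maxima of the accumulator correspond to pose hypotheses;
    # for brevity only the global maximum is returned here.
    return np.unravel_index(np.argmax(accumulator), accumulator_shape)
```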
As another pose estimation approach, the system of Bachler et al. (1999) (cf. Sect. 2.1.1.2) is regarded for comparison. The accuracy of the estimated rotation angle around the optical axis is better than 3°. For the translational pose parameters parallel to the image plane, which are estimated using a CAD model of the object, no accuracy values are given.
The system of Chang et al. (2009) described in Sect. 2.1.1.2 relies on synthetically generated images of the object showing specular reflections and on the optical flow associated with specular reflections. Using synthetic images as ground truth, an average translational accuracy parallel to the image plane of better than 0.5 mm is obtained, while no depth value is determined. The rotational accuracy typically corresponds to 2.5°–5°. The presented qualitative evaluation on real images suggests that these accuracy values are also realistic for real images.
The system of Collet et al. (2011) described in Sect. 2.1.1.2 is based on a single calibrated image or on several calibrated images of the scene. A resolution of about 1 mm per pixel can be inferred from the given image size of 640 × 480 pixels and the apparent size of the objects in the images shown. In a monocular setting, the translational pose estimation error is below 15 mm and decreases to below 5 mm when three images are used. The rotational accuracy corresponds to about 6° for the monocular setting and to about 3.5°–6° for three images, depending on how the pose estimation results inferred from the individual images are combined.
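The inferred resolution follows directly from the ratio of the physical object extent to its apparent extent in pixels. The small sketch below is only a back-of-the-envelope check; the object extents used are assumed, illustrative values, not figures given by Collet et al. (2011).

```python
# Rough plausibility check of the ~1 mm per pixel figure.
object_extent_mm = 120.0   # assumed physical extent of the object
object_extent_px = 120.0   # assumed apparent extent in the 640 x 480 image
resolution_mm_per_px = object_extent_mm / object_extent_px
print(f"{resolution_mm_per_px:.1f} mm per pixel")  # -> 1.0 mm per pixel
```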
6.1.2 Pose Refinement
This section examines the application of the appearance-based approach described in Sect. 5.6 to three-dimensional pose estimation of automotive parts. The description in this section is adopted from Barrois and Wöhler (2007), whose method combines monocular photopolarimetric, edge, and defocus information. As a first example, the oil cap is regarded again (cf. Fig. 6.5). The experimental setup is the