a viewer-centred representation of the image data. The views are generated automat-
ically from a three-dimensional object model by rendering, and the pose parameters
of each view are stored in a table. Edge templates are computed for each view.
For the input image, the best-fitting template and thus the corresponding pose pa-
rameters are determined by a template matching procedure. The difficult trade-off
between the tessellation constant, i.e. the difference between the pose parameters of
neighbouring views, and the accuracy of pose estimation is alleviated by a technique
for hierarchical template matching (Gavrila and Philomin, 1999).
The input image first undergoes an edge detection procedure. A distance trans-
form (DT) then converts the segmented binary edge image into what is called a
distance image. The distance image encodes the distance in the image plane of each
image point to its nearest edge point. If we denote the set of all points in the image
as $A = \{\mathbf{a}_1, \ldots, \mathbf{a}_N\}$ and the set of all edge points as
$B = \{\mathbf{b}_1, \ldots, \mathbf{b}_M\}$ with $B \subseteq A$, then the distance
$d(\mathbf{a}_n, B)$ for point $\mathbf{a}_n$ is given by

$$ d(\mathbf{a}_n, B) = \min_{\mathbf{b}_m \in B} \|\mathbf{a}_n - \mathbf{b}_m\| \qquad (2.1) $$

where $\|\cdot\|$ is a norm on the points of $A$ and $B$ (e.g. the Euclidean norm). For
numerical simplicity we use the chamfer-2-3 metric (Barrow, 1977) to approximate
the Euclidean metric.
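The chamfer-2-3 distance transform can be computed with the classical two-pass sweep: axial steps cost 2 and diagonal steps cost 3, so the stored values approximate twice the Euclidean distance to the nearest edge pixel. A minimal sketch (the function name and the use of plain nested lists are illustrative choices, not taken from the cited work):

```python
INF = 10**9  # stands in for "infinity" before distances are propagated

def chamfer_dt(edges):
    """edges: 2-D list of 0/1 values (1 = edge pixel).
    Returns the chamfer-2-3 distance image as a 2-D list."""
    h, w = len(edges), len(edges[0])
    d = [[0 if edges[y][x] else INF for x in range(w)] for y in range(h)]

    # Forward pass: propagate distances from the top-left corner.
    for y in range(h):
        for x in range(w):
            for dy, dx, cost in ((-1, -1, 3), (-1, 0, 2), (-1, 1, 3), (0, -1, 2)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    d[y][x] = min(d[y][x], d[ny][nx] + cost)

    # Backward pass: propagate distances from the bottom-right corner.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            for dy, dx, cost in ((1, 1, 3), (1, 0, 2), (1, -1, 3), (0, 1, 2)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:
                    d[y][x] = min(d[y][x], d[ny][nx] + cost)
    return d
```

For a single edge pixel in the centre of a 3 × 3 image this yields 2 for the axial neighbours and 3 for the diagonal neighbours, i.e. roughly twice their Euclidean distances of 1 and √2.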
The chamfer distance $D_C(T, B)$ between an edge template consisting of a set of
edge points $T = \{\mathbf{t}_1, \ldots, \mathbf{t}_Q\}$ with $T \subseteq A$ and the input edge image is given by

$$ D_C(T, B) = \frac{1}{Q} \sum_{n=1}^{Q} d(\mathbf{t}_n, B). \qquad (2.2) $$
A correspondence between a template and an image region is assumed to be present
once the distance measure ('dissimilarity') $D_C(T, B)$ falls below a given
threshold value $\theta$. To reduce false detections, the distance measure was extended to
include oriented edges (Gavrila and Philomin, 1999).
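Given a precomputed distance image, evaluating Eq. (2.2) at a candidate translation reduces to averaging the distance-image values under the template's edge points; scanning all translations and thresholding gives the basic matcher. A sketch under assumed conventions (templates as lists of (row, column) offsets; the function names are illustrative):

```python
def chamfer_score(dist_img, template, ty, tx):
    """Mean distance-image value over the template's edge points placed at
    offset (ty, tx): this is D_C(T, B) of Eq. (2.2), up to the factor-2
    scaling of the chamfer-2-3 metric."""
    h, w = len(dist_img), len(dist_img[0])
    total = 0
    for py, px in template:
        y, x = ty + py, tx + px
        if not (0 <= y < h and 0 <= x < w):
            return float('inf')  # template falls outside the image
        total += dist_img[y][x]
    return total / len(template)

def match(dist_img, template, theta):
    """Return all translations whose dissimilarity falls below theta."""
    h, w = len(dist_img), len(dist_img[0])
    return [(ty, tx)
            for ty in range(h)
            for tx in range(w)
            if chamfer_score(dist_img, template, ty, tx) < theta]
```

Note that only the template points are visited per candidate position; the expensive nearest-edge search was paid once in the distance transform.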
In order to recognise an object with unknown rotation and translation, a set of
transformed templates must be correlated with the distance image. Each template is
derived from a certain rotation of the three-dimensional object. In previous work, a
uniform tessellation often involved a difficult choice for the value of the tessellation
constant. If one chooses a relatively large value, the views that lie 'in between' grid
points on the viewing sphere are not properly represented in the regions where the
aspect graph is undergoing rapid changes. This decreases the accuracy of the mea-
sured pose angles. On the other hand, if one chooses a relatively small value for the
tessellation constant, this results in a large number of templates to be matched on-
line; matching all these templates sequentially is computationally intensive and pro-
hibitive to any real-time performance. The difficult trade-off regarding tessellation
constant is alleviated by a technique for hierarchical template matching, introduced
by Gavrila and Philomin (1999). That technique, designed for distance transform-based
matching, derives in an offline stage a representation that takes into account
the structure of the given distribution of templates, i.e. their mutual degrees of sim-
ilarity. In the online stage, this approach allows an optimisation of the matching
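The coarse-to-fine idea behind such hierarchical matching can be illustrated with a toy sketch: templates are grouped offline by mutual similarity, and online each group's prototype is matched first with a loosened threshold, so that a whole group is pruned when its prototype already scores badly. The grouping into flat (prototype, members) pairs and the additive threshold slack are simplifying assumptions, not the exact scheme of the cited paper:

```python
def hierarchical_match(score, groups, theta, slack=1.0):
    """score(t): dissimilarity of template t at the current image location.
    groups: list of (prototype, members) pairs built offline by similarity.
    Returns the best (template, score) pair below theta, or None."""
    best = None
    for prototype, members in groups:
        # Coarse level: if even the prototype is far off, skip the group.
        if score(prototype) >= theta + slack:
            continue
        # Fine level: evaluate only the surviving group's members.
        for t in members:
            s = score(t)
            if s < theta and (best is None or s < best[1]):
                best = (t, s)
    return best
```

With many templates and effective pruning, the number of dissimilarity evaluations drops from the full template count towards the number of prototypes plus the members of the few surviving groups.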