imaging conditions, and should be taken into account when determining if a detec-
tion is repeated. Figure 4.21 b illustrates this more stringent test. A detection is
considered repeated if the area of intersection of the two regions is sufficiently large
compared to the area of their union (e.g., above sixty percent).
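The overlap test above can be sketched in a few lines. This is only a minimal illustration, assuming each detected region has already been rasterized into a set of (x, y) pixel coordinates and that the region from the second image has been mapped into the first image's coordinate frame. The function names, the rectangular test regions, and the default threshold of 0.6 (taken from the "sixty percent" example) are choices made here for illustration.

```python
def overlap_ratio(region_a, region_b):
    """Area of intersection divided by area of union, for two pixel sets."""
    a, b = set(region_a), set(region_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty regions: treat as no overlap
    return len(a & b) / union


def is_repeated(region_a, region_b, threshold=0.6):
    """A detection counts as repeated if the overlap ratio exceeds the threshold."""
    return overlap_ratio(region_a, region_b) > threshold


def rect_pixels(x0, y0, width, height):
    """Hypothetical helper: pixel set of an axis-aligned rectangle."""
    return {(x, y) for x in range(x0, x0 + width) for y in range(y0, y0 + height)}


region_1 = rect_pixels(0, 0, 10, 10)
shifted_by_2 = rect_pixels(2, 0, 10, 10)  # intersection 80, union 120
shifted_by_3 = rect_pixels(3, 0, 10, 10)  # intersection 70, union 130

print(is_repeated(region_1, shifted_by_2))
print(is_repeated(region_1, shifted_by_3))
```

In practice the regions are ellipses rather than rectangles, but once they are expressed as pixel sets (or as binary masks), the same ratio applies unchanged.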
Mikolajczyk et al. [ 329 ] surveyed the affine-covariant feature detectors discussed
in Section 4.1, and tested them with respect to viewpoint and scale change, blurring,
JPEG compression, and illumination changes on a varied set of images. Their gen-
eral conclusions were that the Hessian-Affine and MSER detectors had the highest
repeatability under the various conditions, followed by the Harris-Affine detector.
In general, Hessian-Affine and Harris-Affine produced a larger number of detected
pairs than the other algorithms. They then used the SIFT descriptor as the basis for
matching features from each detector, computing a matching score as
\[
MS = \frac{M}{\min(N_1, N_2)} \tag{4.42}
\]
where M is the number of correct nearest-neighbor matches computed using
Euclidean distance between the descriptors. They generally concluded that the
Hessian-Affine and Harris-Affine detectors produced a large number of matches (but
with a relatively high false alarm rate), while MSER produced a lower number of
matches (but with a low false alarm rate).
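As a rough sketch of how the matching score (4.42) might be computed, the code below performs brute-force nearest-neighbor matching with Euclidean distance and counts the matches that agree with a ground-truth set. Several things here are assumptions for illustration: N_1 and N_2 are taken to be the numbers of features in the two images, descriptors are plain tuples of floats, and `ground_truth` is a hypothetical set of index pairs (i, j) already judged correct by the region overlap test.

```python
import math


def nearest_neighbor_matches(desc1, desc2):
    """Pair each descriptor in desc1 with its Euclidean nearest neighbor in desc2."""
    matches = []
    for i, d1 in enumerate(desc1):
        j = min(range(len(desc2)), key=lambda k: math.dist(d1, desc2[k]))
        matches.append((i, j))
    return matches


def matching_score(desc1, desc2, ground_truth):
    """MS = M / min(N1, N2), with M the number of correct nearest-neighbor matches."""
    if not desc1 or not desc2:
        return 0.0
    matches = nearest_neighbor_matches(desc1, desc2)
    m = sum(1 for pair in matches if pair in ground_truth)
    return m / min(len(desc1), len(desc2))


# Toy 2D "descriptors"; real SIFT descriptors are 128-dimensional.
desc_a = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
desc_b = [(0.0, 0.1), (5.0, 5.2), (9.0, 9.0)]
truth = {(0, 0), (2, 1)}  # hypothetical ground-truth correspondences

print(matching_score(desc_a, desc_b, truth))
```

In this toy case, the second descriptor in `desc_a` has no true counterpart, so its nearest neighbor is counted as an incorrect match and lowers the score.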
Mikolajczyk and Schmid [ 328 ] followed up with a more comprehensive evalua-
tion of feature descriptors, considering combinations of the Harris-Laplace/Affine
and Hessian-Laplace/Affine detectors with most of the descriptors discussed in
Section 4.2. They investigated the same changes in imaging conditions, computing
the precision and recall of each detector/descriptor combination as functions of
a changing parameter (e.g., the rotation angle between the images). Here, precision
and recall are defined as
\[
\text{precision} = \frac{\#\,\text{correct matches}}{\#\,\text{total matches}}, \qquad
\text{recall} = \frac{\#\,\text{correct matches}}{\#\,\text{true correspondences}}
\tag{4.43}
\]
where the correct matches and true correspondences are determined from the
repeatability score and region overlap measure defined previously. A good descriptor
should have high precision (that is, few false matches) and high recall (that is,
few matches that are present in the detector results but poorly represented by the
descriptor. Their general conclusions, independent of the detector used, were that
the GLOH and SIFT descriptors had the best performance. Shape contexts and PCA-
SIFT also performed well. This study also confirmed the usefulness of the nearest
neighbor distance ratio for matching SIFT descriptors.
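The precision and recall in (4.43), together with nearest neighbor distance ratio matching, can be illustrated with a small sketch. It assumes that matches and true correspondences are represented as index pairs (i, j), and that a match is accepted only when the nearest descriptor is sufficiently closer than the second nearest. The ratio threshold of 0.8 is a commonly used value and not one taken from the studies above, and all function names are hypothetical.

```python
import math


def ratio_test_matches(desc1, desc2, max_ratio=0.8):
    """Accept a nearest-neighbor match only if d_nearest < max_ratio * d_second."""
    matches = []
    if len(desc2) < 2:
        return matches  # the ratio needs a second-nearest neighbor
    for i, d1 in enumerate(desc1):
        ranked = sorted(range(len(desc2)), key=lambda k: math.dist(d1, desc2[k]))
        d_best = math.dist(d1, desc2[ranked[0]])
        d_second = math.dist(d1, desc2[ranked[1]])
        if d_best < max_ratio * d_second:
            matches.append((i, ranked[0]))
    return matches


def precision_recall(matches, true_correspondences):
    """precision = correct / total matches; recall = correct / true correspondences."""
    matches = set(matches)
    truth = set(true_correspondences)
    correct = len(matches & truth)
    precision = correct / len(matches) if matches else 0.0
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


# Toy 2D "descriptors"; the third query is equidistant from two candidates.
query = [(0.0, 0.0), (10.0, 0.0), (5.0, 0.0)]
candidates = [(0.0, 1.0), (10.0, 1.0), (4.0, 3.0), (6.0, 3.0)]
truth = {(0, 0), (1, 1), (2, 2)}  # hypothetical ground truth

accepted = ratio_test_matches(query, candidates)
print(accepted, precision_recall(accepted, truth))
```

The ambiguous third query is rejected by the ratio test, which keeps precision high at the cost of recall. Sweeping `max_ratio` traces out this trade-off.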
Moreels and Perona [ 336 ] undertook a similar controlled evaluation of detec-
tor/descriptor combinations, for the specific problem of matching features in
close-up images of 3D objects with respect to viewpoint and lighting changes. They
found that Hessian-Affine and DoG detectors with SIFT descriptors had consis-
tently high performance for viewpoint changes. MSER and shape contexts, which
performed well on planar scenes in [ 328 ], were found to have only average perfor-
mance for matching 3D objects. The Harris-Affine detector with the SIFT descriptor
 