intensity extrema and trace many rays outward until a photometric measure reaches
an extremum along each ray. An ellipse is fit to the resulting points, producing an
affine-invariant region. Mikolajczyk et al. [ 329 ] found the performance of the two
detectors to be reasonable, but noted that their computational cost was quite high
compared to the Harris/Hessian-Laplace and MSER detectors. Kadir and Brady [ 229 ]
also proposed an affine-invariant detector based on the idea that good features should
be detected at patches whose intensity distributions have high entropy. However, the
algorithm is extremely slow, and Mikolajczyk et al. [ 329 ] found that its performance
did not compare favorably with the algorithms discussed here.
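
The entropy criterion mentioned above can be illustrated with a minimal sketch. It computes only the Shannon entropy of a patch's intensity histogram; it is not the Kadir and Brady detector, and it leaves out how that detector selects region scale and shape. The function name `patch_entropy`, the 16-bin histogram, and the 8-bit intensity range are assumptions made for this illustration.

```python
import math
from collections import Counter


def patch_entropy(patch, bins=16, max_value=256):
    """Shannon entropy (in bits) of the intensity histogram of a 2D patch.

    `patch` is a list of rows of integer intensities in [0, max_value).
    The bin count is an illustrative choice, not taken from the text.
    """
    counts = Counter((v * bins) // max_value for row in patch for v in row)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# A constant patch has a one-bin histogram; a patch that spreads its
# intensities evenly over all 16 bins has the largest possible entropy.
flat = [[128] * 8 for _ in range(8)]
textured = [[(16 * (r * 8 + c)) % 256 for c in range(8)] for r in range(8)]

flat_entropy = patch_entropy(flat)
textured_entropy = patch_entropy(textured)
```

Under this measure the constant patch scores zero and the evenly spread patch scores log2(16) = 4 bits, which is the sense in which a high-entropy patch is a more promising feature location than a uniform one.
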
FAST corners were predated by an early approach to fast low-level corner detection
called SUSAN, proposed by Smith and Brady [ 461 ]. A disc is centered around a
candidate point and the area of pixels in the disc with intensities similar to the center
pixel is computed. A corner is detected if the area is a local minimum and below
half the disc area. Another approach by Trajkovic and Hedley [ 497 ] uses the same
concept of a circle of pixels around a candidate point, assuming that for some pair
of diametrically opposite points, the intensities must substantially differ from the
center point.
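
The SUSAN area test described above can be sketched as follows. This is a simplified illustration of the rule as stated in the text (count the disc pixels similar to the center, then keep local minima below half the disc area), not a reproduction of the published detector. The radius, the hard similarity threshold `t`, the 3x3 neighborhood used for the local-minimum check, and all function names are assumptions.

```python
def disc_offsets(radius):
    """(dy, dx) offsets of all pixels within `radius` of the center."""
    r = range(-radius, radius + 1)
    return [(dy, dx) for dy in r for dx in r
            if dy * dy + dx * dx <= radius * radius]


def similar_area(img, y, x, offsets, t):
    """Count disc pixels whose intensity is within t of the center pixel."""
    center = img[y][x]
    return sum(1 for dy, dx in offsets
               if abs(img[y + dy][x + dx] - center) <= t)


def susan_like_corners(img, radius=3, t=25):
    """Return (row, col) points whose similar-pixel area is a strict local
    minimum over the 8 neighbors and is below half the disc area."""
    offsets = disc_offsets(radius)
    half_disc = len(offsets) / 2
    h, w = len(img), len(img[0])
    # Only evaluate pixels whose whole disc lies inside the image.
    area = {(y, x): similar_area(img, y, x, offsets, t)
            for y in range(radius, h - radius)
            for x in range(radius, w - radius)}
    corners = []
    for (y, x), a in area.items():
        if a >= half_disc:
            continue
        # Neighbors outside the evaluated region never block a minimum.
        neighbors = [area.get((y + dy, x + dx), len(offsets) + 1)
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                     if (dy, dx) != (0, 0)]
        if all(a < n for n in neighbors):
            corners.append((y, x))
    return corners


# Toy image: a bright square whose top-left corner is at row 7, column 7.
image = [[200 if (y >= 7 and x >= 7) else 0 for x in range(15)]
         for y in range(15)]
corners = susan_like_corners(image)
```

On this toy image the similar-pixel area at the square's corner covers roughly a quarter of the disc, while points along a straight edge cover more than half, so only the corner at (7, 7) passes both conditions.
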
Tell and Carlsson proposed an affine-invariant descriptor based on the line of
intensities connecting pairs of detected features [ 487 ]. However, choosing appropriate
feature pairs and obtaining a sufficient number of matches for a given application
can be problematic. Forssén and Lowe [ 149 ] proposed a descriptor for MSERs based
on the region's shape alone, since MSER shapes can be quite distinctive. The descriptor
is based on applying the SIFT descriptor to the binary patch corresponding to the
MSER.
Lepetit and Fua [ 270 ] proposed a feature recognition algorithm based on randomized
trees that assumes that several registered training images of the object to be
detected are available. The idea is to build a library of the expected appearances of
each feature from many different synthetic viewpoints, and then to build a classifier
that determines the feature (if any) to which a new pixel patch corresponds. While the
training phase requires some computational effort, the recognition algorithm is fast,
since it only requires the traversal of a precomputed tree. Thus, feature descriptors
are not explicitly formed and compared. The approach was later extended to non-
hierarchical structures called ferns [ 358 ]. Stavens and Thrun [ 466 ] similarly noted
that if a feature matching problem is known to arise from a certain domain (e.g.,
tracking shots of buildings), a machine learning algorithm could be used to tune the
parameters of a detector/descriptor algorithm to obtain the best performance on
domain-specific training data. These kinds of learning algorithms are worth investigation,
with the understanding that performance may suffer if input from a different
domain is used.
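
The classification-based recognition idea above can be made concrete with a toy fern-style classifier. Several points in this sketch are assumptions rather than details from the text: each fern applies a few random pixel-pair intensity comparisons, the comparison outcomes index a table of per-feature counts, and the ferns are combined by summing log probabilities. The `perturb` function merely adds noise as a stand-in for rendering each feature from synthetic viewpoints, and all names are hypothetical.

```python
import math
import random


class Fern:
    """A small set of binary pixel-pair tests plus per-class count tables."""

    def __init__(self, patch_size, n_tests, n_classes, rng):
        def pick():
            return rng.randrange(patch_size), rng.randrange(patch_size)
        self.tests = [(pick(), pick()) for _ in range(n_tests)]
        # Counts start at 1 so unseen outcomes keep a nonzero probability.
        self.counts = [[1] * (2 ** n_tests) for _ in range(n_classes)]

    def index(self, patch):
        """Pack the outcomes of all tests into one integer table index."""
        idx = 0
        for (y1, x1), (y2, x2) in self.tests:
            idx = (idx << 1) | int(patch[y1][x1] < patch[y2][x2])
        return idx

    def train(self, patch, label):
        self.counts[label][self.index(patch)] += 1

    def log_prob(self, patch, label):
        row = self.counts[label]
        return math.log(row[self.index(patch)] / sum(row))


def classify(ferns, patch, n_classes):
    """Pick the feature label with the highest summed log probability."""
    scores = [sum(f.log_prob(patch, c) for f in ferns)
              for c in range(n_classes)]
    return max(range(n_classes), key=scores.__getitem__)


def perturb(patch, rng, noise=10):
    """Placeholder for synthetic viewpoint generation: additive noise only."""
    return [[v + rng.randint(-noise, noise) for v in row] for row in patch]


rng = random.Random(0)
size = 8
# Two toy "features": a horizontal ramp and a vertical ramp.
features = [
    [[30 * x for x in range(size)] for y in range(size)],
    [[30 * y for x in range(size)] for y in range(size)],
]
ferns = [Fern(size, n_tests=6, n_classes=2, rng=rng) for _ in range(10)]
for label, patch in enumerate(features):
    for _ in range(50):
        sample = perturb(patch, rng)
        for fern in ferns:
            fern.train(sample, label)

predictions = [classify(ferns, perturb(p, rng), 2) for p in features]
```

Recognition here costs only a handful of pixel comparisons and table lookups per fern, which reflects the point made above that no descriptor is explicitly formed or compared at run time; the expensive part is populating the tables during training.
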
The popularity of SIFT and its validation as a high-performance descriptor have
led to a variety of extensions (for example, the color versions in Section 4.4 ). One
area of particular interest is the acceleration of the algorithm, since in its original
form, descriptor computation and matching were fairly slow, especially compared
to template-based cross-correlation. One approach is to leverage the processing
power of GPUs (e.g., [ 455 ]). Alternately, other groups have stripped out features
of SIFT to make the basic idea viable on a resource-constrained platform like a
mobile phone (e.g., [ 523 ]). In general, any approach that requires the matching