Features and Matching - Computer Vision for Visual Effects


intensities have been normalized as described in Section 4.2.1. The dimension of the descriptor is the number of intensity bins times the number of rings. Since there are no angular subdivisions of the rings, the descriptor is rotation-invariant.

4.2.4 Invariant-Based Descriptors

The first step in SIFT and the similar approaches in the previous section is the rotation of a patch around the estimated feature location so that the dominant gradient orientation points in a consistent direction. An alternative is to design a descriptor that is invariant to the rotation of the patch in the first place, bypassing this estimation and explicit rotation. Such approaches are generally based on invariant functions of the patch pixels with respect to a class of geometric transformations, typically rotations or affine transformations. For example, the spin image discussed in the previous section is a crude rotation-invariant descriptor, but substantial discriminative information may be lost as the intensities from increasingly larger rings are aggregated into histograms.

Schmid and Mohr [430] popularized the idea of using differential invariants for constructing rotation-invariant descriptors. That is, the descriptor is constructed using combinations of increasingly higher-order derivatives of the Gaussian-smoothed image $L(x,y)$ given by Equation (4.16). For example, the total intensity $\sum_{(x,y)} L(x,y)$, the sum of squared gradient magnitude $\sum_{(x,y)} \left[ \left( \frac{\partial L(x,y)}{\partial x} \right)^2 + \left( \frac{\partial L(x,y)}{\partial y} \right)^2 \right]$, and the sum of Laplacians $\sum_{(x,y)} \left[ \frac{\partial^2 L(x,y)}{\partial x^2} + \frac{\partial^2 L(x,y)}{\partial y^2} \right]$ are all invariant to rotation as long as the sums are taken over equivalent circular regions (such as the ones we obtain at the end of the affine adaptation process, Figure 4.12). A vector of these three quantities could be used as a descriptor. Since an image can be uniquely defined by its derivatives (e.g., consider a Taylor series), the more differential invariants we use, the more uniquely we describe the region around the feature location. For example, there are five differential invariants that use combinations of up to second derivatives, and nine differential invariants that use combinations of up to third derivatives. On the other hand, the higher the order of the derivatives we need, the more difficult they are to estimate accurately from an image patch, especially in the presence of noise.
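As a rough illustration of the three-element descriptor described above, the following sketch computes the total intensity, the sum of squared gradient magnitude, and the sum of Laplacians over a centered circular region, then compares the result for a patch and its 90-degree rotation. Several things here are assumptions made for the example and do not come from the text: the function names, the scale `sigma`, the region `radius`, the random test patch, and the use of finite differences of a blurred image in place of true Gaussian derivative filters.

```python
import numpy as np

def gaussian_smooth(img, sigma):
    """Separable Gaussian blur with zero padding at the borders."""
    r = int(3 * sigma)
    t = np.arange(-r, r + 1)
    k = np.exp(-t**2 / (2.0 * sigma**2))
    k /= k.sum()
    blur = lambda v: np.convolve(v, k, mode="same")
    return np.apply_along_axis(blur, 1, np.apply_along_axis(blur, 0, img))

def differential_invariants(img, sigma=2.0, radius=12):
    """Three simple rotation-invariant sums over a centered disk.

    sigma and radius are illustrative values, not taken from the text.
    """
    L = gaussian_smooth(img.astype(float), sigma)
    # Finite-difference approximations of the first and second derivatives.
    Ly, Lx = np.gradient(L)
    Lyy, _ = np.gradient(Ly)
    _, Lxx = np.gradient(Lx)
    # Circular region centered on the patch (the assumed feature location).
    h, w = L.shape
    yy, xx = np.mgrid[:h, :w]
    mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius**2
    return np.array([
        L[mask].sum(),                 # total intensity
        (Lx**2 + Ly**2)[mask].sum(),   # sum of squared gradient magnitude
        (Lxx + Lyy)[mask].sum(),       # sum of Laplacians
    ])

rng = np.random.default_rng(0)
patch = rng.random((41, 41))           # odd size so the center is a pixel
d0 = differential_invariants(patch)
d90 = differential_invariants(np.rot90(patch))
print(np.allclose(d0, d90))            # compare the two descriptors
```

A 90-degree rotation is used because it maps the pixel grid onto itself, so the comparison is not affected by interpolation. For arbitrary angles the agreement would only be approximate.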

Another approach is to use moment invariants, which are computed using both image intensities and spatial coordinates. The $(m,n)$th moment of a function $f(x,y)$ defined over a region is the average value of $x^m y^n f(x,y)$. The (0,0) moment is thus the average value of the function, and the (1,0) and (0,1) moments give the center of gravity with respect to the function. Higher-order moments represent moments of inertia and skewness. Flusser [147] derived combinations of moments that were invariant to rotations based on earlier work by Hu [205]. Van Gool et al. [510] enumerated the affine-invariant moments up to $m + n \leq 2$, which can be used to construct affine-invariant feature descriptors. Mikolajczyk and Schmid [328] suggested that moment invariants could be applied to $x$ and $y$ gradient images as well.
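The moment definition above can be sketched in a few lines. The example computes the $(m,n)$th moment as an average over a circular region, locates the center of gravity from the first-order moments divided by the (0,0) moment, and forms the sum of the two second-order central moments. That sum is the simplest rotation-invariant combination in Hu's classical set; it is used here only as an illustration and is not claimed to be one of the specific invariants in [147] or [510]. The function names, region radius, and random test patch are assumptions made for the example.

```python
import numpy as np

def moment(f, mask, m, n, origin=(0.0, 0.0)):
    """(m, n)th moment: average of x^m y^n f(x, y) over the masked region,
    with coordinates measured from origin = (x0, y0)."""
    h, w = f.shape
    y, x = np.mgrid[:h, :w].astype(float)
    return ((x - origin[0]) ** m * (y - origin[1]) ** n * f)[mask].mean()

def rotation_invariant(f, mask):
    """Sum of second-order central moments (mu20 + mu02)."""
    m00 = moment(f, mask, 0, 0)
    # Center of gravity: first-order moments normalized by the (0,0) moment.
    centroid = (moment(f, mask, 1, 0) / m00, moment(f, mask, 0, 1) / m00)
    mu20 = moment(f, mask, 2, 0, centroid)
    mu02 = moment(f, mask, 0, 2, centroid)
    return mu20 + mu02

# Circular region centered on an odd-sized patch (illustrative sizes).
h = w = 41
yy, xx = np.mgrid[:h, :w]
mask = (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= 15**2

rng = np.random.default_rng(0)
patch = rng.random((h, w))
inv0 = rotation_invariant(patch, mask)
inv90 = rotation_invariant(np.rot90(patch), mask)
print(np.isclose(inv0, inv90))  # compare before and after rotation
```

Individually, the two central moments swap under a 90-degree rotation, which is why they are combined rather than used directly as descriptor entries.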

Schaffalitzky and Zisserman [423] proposed to use a bank of complex filters whose magnitude responses are invariant to rotation, indexed by two positive integers $m$ and $n$ and given by the coefficients

$$K_{mn}(x,y) = (x + iy)^m (x - iy)^n \, G(x,y) \qquad (4.40)$$
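A small numerical check can make the rotation invariance of the magnitude responses concrete. The sketch below assumes the filters take the form $K_{mn}(x,y) = (x+iy)^m (x-iy)^n G(x,y)$ with $G$ an isotropic Gaussian; the Gaussian width `sigma`, the patch size, the particular `(m, n)` pairs, and the function names are all hypothetical choices for illustration. The response of each filter is taken as the sum of the filter coefficients times the patch intensities.

```python
import numpy as np

def complex_filter(m, n, size=21, sigma=4.0):
    """Sample K_mn(x, y) = (x + iy)^m (x - iy)^n G(x, y) on a centered grid.

    G is assumed to be an isotropic Gaussian; sigma is an illustrative value.
    """
    r = size // 2
    y, x = np.mgrid[-r:r + 1, -r:r + 1].astype(float)
    G = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return (x + 1j * y) ** m * (x - 1j * y) ** n * G

def filter_responses(patch, orders):
    """Complex response of each filter: sum of coefficients times intensities."""
    size = patch.shape[0]
    return np.array([np.sum(complex_filter(m, n, size) * patch)
                     for m, n in orders])

orders = [(1, 1), (2, 1), (3, 1), (3, 2)]   # hypothetical (m, n) pairs
rng = np.random.default_rng(1)
patch = rng.random((21, 21))                # odd size, centered grid

r0 = filter_responses(patch, orders)
r90 = filter_responses(np.rot90(patch), orders)

# Rotating the patch only changes the phase of each response, so the
# magnitudes are compared rather than the complex values themselves.
print(np.allclose(np.abs(r0), np.abs(r90)))
```

Under a rotation by angle $\theta$, each response is multiplied by a unit-magnitude factor whose phase is proportional to $(m-n)\theta$, which is why only the magnitudes are compared.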
