the sampled points average. On the basis of the maximum and minimum x - and y -coordinates
of the projection of all the n samples, a scalar measure for key-point selection is defined,
$\delta = \max(p_x) - \min(p_x) + \min(p_y) - \max(p_y)$. A 3D point on the facial surface is considered
a key point if its $\delta$ exceeds a certain threshold. The majority of the detected points were located
on and to the sides of the nose region, which is known to be discriminative. However, it was
also noted that a small number of them are located on the forehead region, which is also very
discriminative. This is because the forehead is slightly curved.
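The key-point test described above can be sketched in a few lines. This is a minimal illustration, assuming that δ is read as the extent of the projected samples along x minus their extent along y; the function names, the sample patches, and the threshold value are hypothetical and are not taken from the original method.

```python
def keypoint_measure(projected):
    """Return delta for one local region.

    `projected` is a list of (p_x, p_y) pairs: the local samples after
    projection onto the region's principal axes (assumed to be given).
    """
    xs = [p[0] for p in projected]
    ys = [p[1] for p in projected]
    # Assumed reading of delta: x-extent minus y-extent of the projection.
    return (max(xs) - min(xs)) - (max(ys) - min(ys))


def is_keypoint(projected, threshold):
    """A point is kept when its delta exceeds the (hypothetical) threshold."""
    return keypoint_measure(projected) > threshold


# An elongated patch has very different extents along the two axes,
# while a symmetric patch has equal extents.
elongated = [(-4.0, -1.0), (4.0, 1.0), (0.0, 0.5)]
symmetric = [(-2.0, -2.0), (2.0, 2.0), (0.0, 0.0)]
```

Under this reading, a patch whose extents are equal along both axes (flat or symmetric) scores zero and is rejected, while an asymmetric patch scores higher.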
PCA features are used as descriptors of the local regions. For matching a pair of surfaces,
the PCA descriptors of the key points are matched against each other. Matched key points
(those with low dissimilarity measures) are used to construct two identical graphs (one on each
facial surface) in which the key points are represented by the graph nodes and their position
information is stored. A total dissimilarity between the pair of surfaces is computed using the
weighted-sum rule from the average of the descriptor matching errors, the number of key-point
matches, and the average of the absolute distance errors among the corresponding edges and
nodes of the two graphs.
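The weighted-sum fusion described above might look like the following sketch. It makes several assumptions that are not stated in the text: the graph is fully connected, only the edge-length term is computed (the node-position term would need the two surfaces to be aligned first), the match count enters as its reciprocal so that more matches lower the score, and the weights are placeholders. No score normalization is shown.

```python
import math
from itertools import combinations


def mean_edge_error(nodes_a, nodes_b):
    """Average absolute difference between corresponding edge lengths.

    nodes_a[i] and nodes_b[i] are the 3D positions of the i-th matched
    key point on the two surfaces. Every pair of nodes is treated as an
    edge (an assumption; the text does not fix the graph topology).
    """
    pairs = list(combinations(range(len(nodes_a)), 2))
    if not pairs:
        return 0.0
    total = 0.0
    for i, j in pairs:
        len_a = math.dist(nodes_a[i], nodes_a[j])
        len_b = math.dist(nodes_b[i], nodes_b[j])
        total += abs(len_a - len_b)
    return total / len(pairs)


def total_dissimilarity(desc_errors, nodes_a, nodes_b, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of three terms; the weights are hypothetical."""
    if not desc_errors:
        raise ValueError("at least one key-point match is required")
    w_desc, w_count, w_edge = weights
    mean_desc = sum(desc_errors) / len(desc_errors)
    # Assumption: more matches means less dissimilar, so use the reciprocal.
    count_term = 1.0 / len(desc_errors)
    return (w_desc * mean_desc
            + w_count * count_term
            + w_edge * mean_edge_error(nodes_a, nodes_b))


face_a = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
face_b = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (0.0, 8.0, 0.0)]
```

Edge lengths are invariant to rigid motion, so this term can be compared without registering the two surfaces first, which is one reason a graph representation is convenient here.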
Filtering kernel methods: Some of the well-known approaches to key-point detection in 2D
images are based on filtering kernels, for example, the SIFT by Lowe (2004). A 3D variant
of the SIFT was proposed by Lo and Siebert (2008) and used for face recognition from range
images. For key-point detection, a Gaussian pyramid is constructed (following the 2D SIFT) in which
each pyramid level has double the resolution (both vertically and horizontally) of the next
level. Each level contains multiple Gaussian range images, the responses of Gaussian kernels
with increasing σ parameters. From each pair of consecutive Gaussian images, a
difference-of-Gaussians (DoG) range image is computed. The extrema (minima and maxima) of the DoG
images over their spatial and scale (the adjacent DoG images) neighborhoods are considered key
points if their deviations from those neighborhoods exceed a certain threshold. It is noticed that
the detected 3D SIFT key points are mostly located in the eye regions, where the acquired
surface is usually unreliable and spiky because of the eyelashes, and appear sparse elsewhere.
This is possibly why their system also included key points detected on the basis of the
ratio of the principal curvatures, $\kappa_1/\kappa_2$, where its local maxima exceed a certain threshold. Parallel
to the 2D SIFT, the 3D SIFT descriptors are histograms of the surface gradients around the
key points. Each key point is assigned a direction, usually the dominant gradient direction,
which is used as a reference for in-plane rotation (as opposed to in-depth rotation) invariance.
A local region of a key point is divided into nine overlapping subregions by means of scaling
by displaced Gaussian functions, and a histogram is computed from each subregion. The histogram bins
correspond to specific ranges of directions. The nine histograms are concatenated to form the
key point descriptor.
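The detection stage described above (Gaussian smoothing at increasing σ, differencing consecutive images, then thresholding extrema over space and scale) can be sketched as follows. This is a simplified, single-level illustration in plain Python rather than the authors' implementation: the pyramid downsampling is omitted, borders are handled by replication, and the σ values, threshold, and function names are assumptions chosen for the example.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel truncated at about three sigma."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(image, sigma):
    """Separable Gaussian blur of a 2D list, with replicated borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    rows, cols = len(image), len(image[0])

    def clamp(v, size):
        return min(max(v, 0), size - 1)

    # Horizontal pass, then vertical pass.
    tmp = [[sum(kernel[k + radius] * image[r][clamp(c + k, cols)]
                for k in range(-radius, radius + 1))
            for c in range(cols)] for r in range(rows)]
    return [[sum(kernel[k + radius] * tmp[clamp(r + k, rows)][c]
                 for k in range(-radius, radius + 1))
             for c in range(cols)] for r in range(rows)]


def dog_stack(image, sigmas):
    """Differences between consecutive Gaussian images (coarser minus finer)."""
    blurred = [blur(image, s) for s in sigmas]
    rows, cols = len(image), len(image[0])
    return [[[b[r][c] - a[r][c] for c in range(cols)] for r in range(rows)]
            for a, b in zip(blurred, blurred[1:])]


def detect_keypoints(image, sigmas, threshold):
    """Return (row, col, dog_index) for extrema that stand out from all
    26 space-scale neighbours by more than `threshold`."""
    dogs = dog_stack(image, sigmas)
    rows, cols = len(image), len(image[0])
    offsets = [(ds, dr, dc) for ds in (-1, 0, 1) for dr in (-1, 0, 1)
               for dc in (-1, 0, 1) if (ds, dr, dc) != (0, 0, 0)]
    found = []
    for s in range(1, len(dogs) - 1):
        for r in range(1, rows - 1):
            for c in range(1, cols - 1):
                value = dogs[s][r][c]
                neighbours = [dogs[s + ds][r + dr][c + dc]
                              for ds, dr, dc in offsets]
                is_max = value - max(neighbours) > threshold
                is_min = min(neighbours) - value > threshold
                if is_max or is_min:
                    found.append((r, c, s))
    return found


# Synthetic range image: one Gaussian bump of width 2 centred at (15, 15).
SIZE, CENTRE, BUMP_SIGMA = 31, 15, 2.0
bump = [[math.exp(-((r - CENTRE) ** 2 + (c - CENTRE) ** 2)
                  / (2.0 * BUMP_SIGMA ** 2))
         for c in range(SIZE)] for r in range(SIZE)]
flat = [[1.0] * SIZE for _ in range(SIZE)]
sigmas = [1.0, 1.6, 2.56, 4.096]  # geometric spacing, an assumed choice

bump_keypoints = detect_keypoints(bump, sigmas, 0.01)
flat_keypoints = detect_keypoints(flat, sigmas, 0.01)
```

With geometrically spaced σ values, the DoG response at the centre of a bump is strongest at the scale closest to the bump's own width, which is why the extremum search runs over the adjacent DoG images as well as the spatial neighbours. A flat surface produces no response at any scale.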
Another kernel-based approach was proposed by Al-Osaimi et al. (2007). Blob-like images
were produced by convolving range images with kernels. By taking the peaks of the blobs as
key points if their difference relative to their spatial neighborhood exceeds a certain threshold,
stable and repeatable key points were obtained. The underlying principle of those blob-generating
kernels is that a set of adjacent higher spatial frequencies (both vertically and
horizontally) interfere constructively and destructively at certain locations and form a pattern of
peaks and troughs. By allowing those frequencies of a range image to pass, a blob-like
image specific to that range image is produced. The kernel is found by first setting selected adjacent
frequencies to a value of one in a square matrix (of the kernel size) and setting the discrete