nal points at a random starting point. The response vectors at each log-polar grid
point on a circle are rated by the local model of the selected landmark according
to the output of the corresponding classifier, f_loc(v). The retina is subsequently
centered at the grid point in the image that maximizes the similarity provided by the
classifier. This procedure is iterated until the retinotopic sensor is centered on a lo-
cal maximum. One advantage of this search strategy is that the search automatically
becomes finer as a local maximum is approached, since the artificial retina is denser
at the center (fovea) than at the periphery. As this application demonstrates, the acuity gradient between peripheral and foveal vision in the topology of the human retina plausibly contributes to fast convergence to targets via saccades (homing). After the saccades have converged to a local optimum, the
retinotopic grid is displaced in a pixel-by-pixel fashion to maximize the output of the
more accurate, but computationally heavier, extended model for the detected facial
landmark. Matches with low classifier scores are discarded at this stage.
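The coarse-to-fine saccadic search described above can be sketched as a hill climb over a log-polar sampling grid. This is a minimal illustration, not the system's implementation: the function `score(x, y)` stands in for the local-model classifier output f_loc(v), and all grid parameters are assumed values chosen only to show the densest-at-the-fovea property.

```python
import numpy as np

def log_polar_grid(n_rings=5, n_spokes=8, r_min=2.0, r_max=32.0):
    """Offsets of a retinotopic grid: rings spaced logarithmically in
    radius, so sampling is densest near the center (fovea) and coarsest
    at the periphery.  All parameters here are illustrative assumptions."""
    radii = np.geomspace(r_min, r_max, n_rings)
    angles = np.linspace(0.0, 2 * np.pi, n_spokes, endpoint=False)
    return np.array([(r * np.cos(a), r * np.sin(a))
                     for r in radii for a in angles])

def saccadic_search(score, start, grid, max_iter=100):
    """Hill-climb: re-center the retina at the grid point with the best
    classifier score, and iterate until the center is a local maximum.
    Because the grid is denser near the center, the search automatically
    becomes finer as the maximum is approached."""
    center = np.asarray(start, dtype=float)
    for _ in range(max_iter):
        candidates = center + grid
        scores = [score(x, y) for x, y in candidates]
        best = int(np.argmax(scores))
        if scores[best] <= score(*center):
            break                        # center is a local maximum
        center = candidates[best]        # saccade to the best grid point
    return center
```

Run on a smooth unimodal score (e.g. a quadratic peak), the search first takes large peripheral jumps and then small foveal steps, converging to within the innermost ring radius of the optimum.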
Once a match for a facial landmark has been found, a saccade to the average
assumed location of one of the others is performed. An attempt at detection is made
directly with the corresponding extended model. If this fails, the search is restarted
at random to look for this feature.
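The landmark-to-landmark logic above, together with the configuration scoring described next, can be sketched as a driver loop. All names here (`local_search`, `extended_detect`, `config_score`, `expected_offset`, the 640x480 image size) are hypothetical stand-ins for the models and quantities in the text, not the actual system API.

```python
import random

def find_face(landmarks, expected_offset, local_search, extended_detect,
              config_score, threshold, max_restarts=100):
    """Sketch of the saccadic landmark-search loop: locate one landmark,
    saccade to the assumed locations of the others, and accept only a
    configuration whose overall match quality is high."""
    for _ in range(max_restarts):
        # 1. Locate the first landmark by saccadic search from a random
        #    starting point (image size is an assumed 640x480).
        start = (random.uniform(0, 640), random.uniform(0, 480))
        found = {landmarks[0]: local_search(landmarks[0], start)}
        # 2. Saccade to the average assumed location of each remaining
        #    landmark and attempt detection with its extended model.
        for lm in landmarks[1:]:
            guess = expected_offset(found, lm)
            pos = extended_detect(lm, guess)
            if pos is None:                         # detection failed:
                pos = local_search(lm, None)        # restart at random
            found[lm] = pos
        # 3. Keep searching until the configuration score, based solely
        #    on the quality of the detected matches, is high enough.
        if config_score(found) >= threshold:
            return found
    return None
```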
A global configuration score is computed based exclusively on the quality of
the matches detected. Saccadic search is continued until a complete set of facial
landmarks that has a very high configuration score is found. In Fig. 9.18, the perfor-
mance of this technique is illustrated by way of examples. These images belong to
the XM2VTS database [162]. In 99.5% of the database images, at least two facial
features were correctly positioned. Following the facial feature detection, an authen-
tication of the found face against a reference or client person can be performed by
comparing the measured features of the global models for all three facial features
with those of the client. This can be done by the same classifier that was used to
locate the eyes and mouth+nose, but adapted to the identity of the person, yielding
0.5% false acceptance at a threshold giving 0.5% false rejection. The details of the system
and its performance are presented in [205].
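An operating point with equal false acceptance and false rejection, such as the 0.5%/0.5% figure above, is typically found by sweeping a decision threshold over client and impostor score distributions. The sketch below shows this generic equal-error-rate selection; it is not the procedure of [205], and the score arrays are assumed inputs.

```python
import numpy as np

def equal_error_threshold(client_scores, impostor_scores):
    """Return the threshold at which the false acceptance rate
    (impostor scores at or above the threshold) is closest to the
    false rejection rate (client scores below it)."""
    candidates = np.sort(np.concatenate([client_scores, impostor_scores]))
    best_t, best_gap = candidates[0], np.inf
    for t in candidates:
        far = np.mean(impostor_scores >= t)   # impostors accepted
        frr = np.mean(client_scores < t)      # clients rejected
        if abs(far - frr) < best_gap:
            best_gap, best_t = abs(far - frr), t
    return best_t
```

Authentication then reduces to accepting an identity claim exactly when the client-adapted classifier score reaches this threshold.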