Indexing, Object Segmentation, and Event Detection in News and Sports Videos - Multimedia Database Retrieval: Technology and Applications

To train the face detector, face and non-face images were

collected from the Extended Yale Database and the CMU Database, which are

publicly available face detection databases with large illumination variations. The

detector was trained to detect a face centered in a standard window with a size of

54 × 48 pixels.

For local normalization, a nonlinear histogram equalization was

applied that takes into account the histogram distribution over a local window and

combines it with the global histogram distribution. Examples of the filtered results

of the original images are shown in Fig. 7.17. After local normalization, the

histograms of all input images in Fig. 7.17 are widely spread to

cover the entire gray scale, and the distribution of pixels is close to uniform. As

a result, three kinds of images are significantly enhanced to give an appearance of

high contrast: dark images, whose histogram components are concentrated at the

low end of the gray scale; bright images, whose histogram components are

biased toward the high end; and low-contrast images, whose histogram components

are narrow and centered toward the middle of the gray scale. After local normalization,

an image with varying lighting conditions shows a great deal of gray-level detail and

has a high dynamic range. Consequently, the system's resistance to natural illumination variation

is improved.
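
The idea of combining a local-window histogram with the global histogram can be sketched as follows. This is a minimal illustration and not the exact normalization used in the experiments: the linear blend, the weight alpha, the reflect padding, and the function name local_global_equalize are all assumptions made for the example.

```python
import numpy as np

def local_global_equalize(img, win=5, alpha=0.5):
    """Blend local-window and global histogram equalization.

    img   : 2-D uint8 grayscale image
    win   : odd side length of the local window (illustrative default)
    alpha : weight of the local term in [0, 1] (hypothetical parameter)
    """
    img = np.asarray(img, dtype=np.uint8)
    h, w = img.shape
    r = win // 2

    # Global CDF: fraction of all pixels with gray level <= g.
    hist = np.bincount(img.ravel(), minlength=256)
    global_cdf = np.cumsum(hist) / img.size

    # Local CDF at each pixel: fraction of the window's pixels
    # whose value is <= the value of the center pixel.
    padded = np.pad(img, r, mode="reflect")
    local_count = np.zeros((h, w), dtype=np.float64)
    for dy in range(win):
        for dx in range(win):
            local_count += padded[dy:dy + h, dx:dx + w] <= img
    local_cdf = local_count / (win * win)

    # Assumed combination rule: a simple weighted average of the two CDFs.
    blended = alpha * local_cdf + (1.0 - alpha) * global_cdf[img]
    return np.round(255.0 * blended).astype(np.uint8)
```

Applied to a dark image whose gray levels occupy only the low end of the scale, the output spans most of the 0-255 range, which is the spreading effect described above.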

Gabor wavelet filters with four scales and eight orientations were applied for

feature extraction. To train the detector, a total of 15,599

samples (8,754 positives and 6,845 negatives) were used. The detector was trained

as a cascade of AdaBoost classifiers. Real AdaBoost, Gentle AdaBoost, and

Modest AdaBoost were compared in terms of error with 200 boosting iterations.

Gentle AdaBoost returned a better face detection rate than the other two and was selected as the

detection algorithm.
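
A Gabor filter bank with four scales and eight orientations can be sketched as below. Only the counts (four scales, eight orientations) come from the text; the kernel size, base wavelength, half-octave scale spacing, and the ratio between sigma and wavelength are assumptions chosen for illustration, and the function names are hypothetical.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0):
    """Return the real and imaginary parts of a 2-D Gabor kernel (odd size)."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(np.float64)
    # Rotate coordinates so the sinusoid runs along direction theta.
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2.0 * sigma ** 2))
    phase = 2.0 * np.pi * xr / wavelength
    return envelope * np.cos(phase), envelope * np.sin(phase)

def gabor_bank(n_scales=4, n_orient=8, size=15, base_wavelength=4.0):
    """Build n_scales * n_orient kernels (32 for the setting in the text)."""
    bank = []
    for s in range(n_scales):
        wavelength = base_wavelength * 2.0 ** (s / 2.0)  # assumed spacing
        sigma = 0.56 * wavelength                        # assumed bandwidth
        for k in range(n_orient):
            theta = k * np.pi / n_orient
            bank.append(gabor_kernel(size, wavelength, theta, sigma))
    return bank

def gabor_magnitudes(img, bank):
    """Magnitude response of each filter, using circular FFT convolution."""
    img = np.asarray(img, dtype=np.float64)
    spectrum = np.fft.fft2(img)
    responses = []
    for real, imag in bank:
        kernel_spectrum = np.fft.fft2(real + 1j * imag, s=img.shape)
        responses.append(np.abs(np.fft.ifft2(spectrum * kernel_spectrum)))
    return np.stack(responses)
```

For a 54 × 48 training window, gabor_magnitudes returns an array of shape (32, 54, 48); the values in such an array are the kind of features a boosted classifier could select from.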

For testing, the face detection methods were applied to the two databases,

which cover various practical aspects of face detection, such as changes in

illumination, pose, and size, and different faces. Figure 7.18 shows the overall performance

of the methods using ROC curves. The detection results were obtained by setting

the window size of the local normalization to 5 × 5, and all training images were also resized to

54 × 48. The detection method

labeled as GW used Gabor wavelet features only, and the method labeled as

GW + LN used combined features of GW and local normalization. These methods

were applied to video data under different illumination conditions. The experimental

results demonstrated that incorporating local normalization improves face detection

accuracy by about 10-15 % in the critical regions of the ROC curve (detection rate

vs. false positives). At the same time, false detection rates dropped by

approximately 15 %.
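
To make the comparison of detection rate vs. false positives concrete, the sketch below computes ROC points from detector scores and reads off the detection rate at a fixed false positive rate. It is an illustrative helper, not the evaluation code behind Fig. 7.18; the function names are hypothetical, and tied scores are not merged into a single threshold.

```python
import numpy as np

def roc_points(scores, labels):
    """Return (fpr, tpr) arrays, one point per score threshold.

    scores : detector confidence per window (higher means more face-like)
    labels : 1 for a true face window, 0 for a non-face window
    """
    scores = np.asarray(scores, dtype=np.float64)
    labels = np.asarray(labels, dtype=bool)
    order = np.argsort(-scores, kind="stable")   # highest score first
    hits = labels[order]
    true_pos = np.cumsum(hits)                   # faces accepted so far
    false_pos = np.cumsum(~hits)                 # non-faces accepted so far
    tpr = np.concatenate(([0.0], true_pos / labels.sum()))
    fpr = np.concatenate(([0.0], false_pos / (~labels).sum()))
    return fpr, tpr

def detection_rate_at(fpr, tpr, max_fpr):
    """Best detection rate achievable without exceeding max_fpr."""
    return float(tpr[fpr <= max_fpr].max())
```

Two detectors, such as GW and GW + LN, can then be compared by evaluating detection_rate_at for both at the same max_fpr value.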

Figures 7.19 and 7.20 show the face detection results from video sequences under

good and bad illumination conditions. It can be observed

that all faces were detected under the varying illumination conditions. The size of the

bounding box was determined using the scale of the detected face on the image.

Finally, the face detector was applied to video sequences containing rotating

poses, varying sizes, and multiple faces. The detection rates are given in Table 7.2.

Columns 2, 3, 4, and 5 indicate video sequences with good illumination conditions,
