Real-Time Face Recognition from Surveillance Video - Intelligent Video Event Analysis and Understanding

A set of feature vectors was calculated over a subset of the test videos and used

to train the OpenCV SVM. The results of training were saved as an XML file that

was used for verification and testing (see section 5.4).

5 Results

The aim of our experiments was to detect faces in a video stream and extract a

reliable set of features as rapidly as possible. The extracted features were then tested for

discrimination between different subjects using a Support Vector Machine (SVM).

The process of face detection and local feature extraction is summarised in

figure 3. A candidate face image can be rejected by the system at a

number of points. To be passed to the back-end, the face candidate must pass

skin detection and the nested Haar cascades for global and local features, and a suitable

set of local features must be identified and extracted. As our highest priorities

were speed and reliability, we simply dropped any frame that could not pass all

these tests. During the experiments, we attempted to reduce the number of dropped

frames without compromising reliability.
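
The drop-on-first-failure behaviour described above can be sketched as a chain of stage checks. This is a minimal illustration, not the authors' implementation: the `process_frame` and `count_dropped` helpers and the stage names are hypothetical, and each stage is assumed to be a callable that returns its result or `None` to reject the candidate.

```python
from typing import Any, Callable, Optional, Sequence, Tuple

# A stage takes the output of the previous stage and returns a result,
# or None to reject the candidate. Stage names are illustrative only.
Stage = Tuple[str, Callable[[Any], Optional[Any]]]


def process_frame(
    frame: Any, stages: Sequence[Stage]
) -> Tuple[Optional[Any], Optional[str]]:
    """Run a frame through each stage in order.

    Returns (result, None) if every stage passes, or
    (None, name_of_failing_stage) if the frame is dropped.
    """
    data = frame
    for name, stage in stages:
        data = stage(data)
        if data is None:
            return None, name  # drop the frame at the first failure
    return data, None


def count_dropped(frames: Sequence[Any], stages: Sequence[Stage]) -> int:
    """Count the frames that fail at least one stage."""
    return sum(1 for f in frames if process_frame(f, stages)[0] is None)
```

Recording the name of the failing stage makes it possible to see where frames are lost, which is the kind of information needed when trying to reduce the number of dropped frames.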

Our feature extraction algorithms were developed and tested using still images

from the FERET database. In order to test how our programs worked on video

streams, we used a database of videos of speakers in different poses from CSIT

(see section 1.3).

5.1 Skin Detection

The OpenCV adaptive skin detector was used to filter the image and to pass face

candidate regions (rather than the whole image) to the Haar cascade detector. Our

results show that performing skin detection on the video frames before passing them

to the Haar cascade improved both speed and reliability.

Use of skin detection helped to eliminate false positives (non-face areas that are

detected as faces). These can include faces that are not real human faces

(e.g., a black-and-white photo of a face on a poster or T-shirt).

Sometimes the OpenCV Haar cascade detector finds the same face object at

different scales (see figure 20). By searching skin regions, we can narrow the search

to find the most meaningful match.
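
One way to choose among several detections of the same face at different scales is to keep the one that overlaps the skin region best. The text does not state which criterion was used, so the intersection-over-union measure below is an assumption, and the `iou` and `best_match` helpers are hypothetical names introduced for illustration.

```python
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def best_match(detections: List[Box], skin_box: Box) -> Optional[Box]:
    """Pick the detection that overlaps the skin region best.

    The overlap criterion is an assumption, not the method from the text.
    """
    if not detections:
        return None
    return max(detections, key=lambda d: iou(d, skin_box))
```

With this criterion, a detection much larger than the skin patch scores poorly because the union grows, and one much smaller scores poorly because the intersection shrinks.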

We experimented with using windows of different sizes around detected skin

patches. The results are shown in figures 21 and 22. It can be seen that constraining

the search area too tightly reduced the accuracy of the results. This is because areas

of shadow to the sides of the face are sometimes not detected as skin. The Haar

cascade detector therefore needs an area wider than the skin patch for a successful

detection. As we increased the search window around the skin patches to include

these shadowed areas, the accuracy of the Haar detection gradually increased until

(at around 30-50 pixels) it was as accurate as searching on the whole image.

The greatest speedup is achieved with the smallest search window. As

we increased the size of the search window, the speedup decreased exponentially.
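
The trade-off between search window size and speed can be illustrated with a short sketch that pads a skin patch by a fixed margin and clamps the result to the frame. This is an assumption-laden example rather than the code used in the experiments: `expand_box` and `search_fraction` are hypothetical helpers, and the fraction of the frame covered is only a rough proxy for detector cost, not a measured speedup.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def expand_box(box: Box, margin: int, frame_w: int, frame_h: int) -> Box:
    """Grow a skin patch by `margin` pixels on every side, clamped to the frame."""
    x, y, w, h = box
    x0 = max(0, x - margin)
    y0 = max(0, y - margin)
    x1 = min(frame_w, x + w + margin)
    y1 = min(frame_h, y + h + margin)
    return (x0, y0, x1 - x0, y1 - y0)


def search_fraction(box: Box, frame_w: int, frame_h: int) -> float:
    """Fraction of the frame area covered by the search window."""
    _, _, w, h = box
    return (w * h) / (frame_w * frame_h)
```

For example, `expand_box((100, 100, 80, 100), 40, 640, 480)` pads the patch with a margin inside the 30-50 pixel range reported above; comparing `search_fraction` before and after the padding shows how much more of the frame the Haar cascade then has to scan.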
