Evaluation of Human Detection Algorithms in Image Sequences - Advanced Concepts for Intelligent Vision Systems


on the whole database. An interpretation evaluation metric that takes both aspects into account and works on a single interpretation result is therefore needed.

This article presents our work on the development of vision-based systems for human detection and tracking in a known environment using a static camera, and on the definition of an adaptable performance measure able to simultaneously evaluate the localization, recognition and detection of interpreted objects in a real scene against a manually made ground truth. Although, in general, localization and recognition should both be as precise as possible, the relative importance of these two aspects can change depending on the foreseen application. Section 2 describes the successive algorithms implemented for the CAPTHOM project, which focuses more particularly on indoor environments. The proposed evaluation metric for a general image interpretation result is presented in Section 3. Its potential interest is illustrated in Section 4 on the CAPTHOM project. Section 5 presents the conclusions and perspectives of this study.

2 Visual-Based System Developments for Human Detection in Image Sequences

Within the CAPTHOM project, we aim to develop a human detection system to limit the power consumption of buildings and to monitor persons with low mobility. This project is one of the numerous applications of human detection systems in home automation, video surveillance, etc. The foreseen system must be easily tunable and embeddable, providing an optimal compromise between false detection rate and algorithmic complexity.

The development of a reliable human detection system for videos must deal with general object detection difficulties (background complexity, illumination conditions, etc.) and with constraints specific to human detection (high variability in skin color, weight and clothes, presence of partial occlusions, a highly articulated body resulting in various appearances, etc.). Despite these difficulties, some very promising systems have already been proposed in the literature. This is especially the case for the method proposed by Viola and Jones [8], which attempts to detect humans in still images using a well-suited representation of human shapes and a classification method. We first implemented this method, which is based on Haar-like filters and AdaBoost, in a sliding window framework that analyzes every image using several classifiers. In an indoor environment, partial occlusions are frequent, and the upper part of the body (head and shoulders) is often the only visible part. As it is clearly insufficient to search the image only for forms similar to the whole human body, we implemented four classifiers: the whole body, the upper body (front/back view), the upper body (left view) and the upper body (right view). In practice, the classifier scans the image with a constant shift in the horizontal and vertical directions. As the size of the person potentially present is not known a priori and the classifier has a fixed size, the image is analyzed several times at different scales: the size of the image is divided by a scale factor (sf) between two consecutive scales. This method is called Viola [8] in the following paragraphs.
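The multi-scale scanning described above can be illustrated with a minimal sketch. It only enumerates the candidate windows that a fixed-size classifier would examine; it does not implement the Haar-like filters or the AdaBoost classifier. The function names, the window size, the shift value and the truncation of image sizes to integers are assumptions made for illustration and are not specified in the text.

```python
from typing import Iterator, Tuple

# (x, y, width, height) expressed in pixels of the original image
Box = Tuple[float, float, float, float]


def pyramid_sizes(img_w: int, img_h: int, win_w: int, win_h: int,
                  sf: float) -> Iterator[Tuple[float, int, int]]:
    """Yield (scale, width, height) for each rescaled image.

    The image size is divided by the scale factor sf between two
    consecutive scales; scanning stops when the rescaled image becomes
    smaller than the fixed-size classifier window.
    """
    if sf <= 1.0:
        raise ValueError("sf must be greater than 1")
    scale = 1.0
    while True:
        # Assumption: sizes are truncated to integers after division.
        w, h = int(img_w / scale), int(img_h / scale)
        if w < win_w or h < win_h:
            break
        yield scale, w, h
        scale *= sf


def sliding_windows(img_w: int, img_h: int, win_w: int, win_h: int,
                    shift: int, sf: float) -> Iterator[Box]:
    """Yield every candidate window, mapped back to original coordinates.

    At each scale, the fixed-size window moves with a constant shift in
    the horizontal and vertical directions.
    """
    if shift < 1:
        raise ValueError("shift must be at least 1 pixel")
    for scale, w, h in pyramid_sizes(img_w, img_h, win_w, win_h, sf):
        for y in range(0, h - win_h + 1, shift):
            for x in range(0, w - win_w + 1, shift):
                # A window in the rescaled image covers a region 'scale'
                # times larger in the original image.
                yield (x * scale, y * scale, win_w * scale, win_h * scale)
```

In a complete detector, each yielded box would be passed to each of the four classifiers. A smaller sf or a smaller shift produces more candidate windows, which illustrates the trade-off between detection quality and algorithmic complexity mentioned earlier.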
