Evaluation of Human Detection Algorithms in Image Sequences - Advanced Concepts for Intelligent Vision Systems

present in the interpretation result. This metric has been chosen according to

the comparative study conducted in [18] on the performance of 33 localization

metrics under various alterations such as translation, scale change and rotation. The

obtained localization score ranges from 0 to 1, with 0 corresponding to a perfect

overlap between the two objects and consequently to a perfect localization.

Note that all the matched objects are quite well localized and obtain

low scores; the poorest score, 0.065, corresponds to the second object of the

interpretation result, namely the isolated person. The evaluation of the recognition

part consists of comparing the class of the object in the ground truth and in the

interpretation result. This comparison can be done in different ways. For example,

a distance matrix between the classes present in the database can be provided,

which would make it possible to evaluate recognition mistakes precisely. On the other hand,

numerous real systems track one specific class of objects and do not tolerate

any approximation in the recognition step. They work in an all-or-nothing

scheme: $S_{rec}(I_{gt}, I_i, u, v) = 0$ if the classes are the same and 1 otherwise. It is the

case in the developed human detection system, where all detections correspond

de facto to the right class, namely a human. Since the recognition evaluation matrix

contains only ones, misclassification is indirectly but heavily penalized

through the over- and under-detection compensation. Because poor localization

must keep a significant weight in the penalty, we choose a high value

of the α parameter (α = 0.8). Finally, we compute the local interpretation score

$S(u, v)$ between two matched objects as a combination of the localization and

the recognition scores:

$$S(u, v) = \alpha \cdot S_{loc}(I_{gt}, I_i, u, v) + (1 - \alpha) \cdot S_{rec}(I_{gt}, I_i, u, v) \qquad (3)$$
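
As a minimal sketch of equation (3), the all-or-nothing recognition score and the local interpretation score could be written as follows. The localization score is assumed to be already available as a number in [0, 1], since the metric itself is not reproduced here. The function names and the class labels are hypothetical and chosen only for illustration.

```python
def recognition_score(class_gt: str, class_result: str) -> float:
    """All-or-nothing recognition score: 0 if the classes match, 1 otherwise."""
    return 0.0 if class_gt == class_result else 1.0


def local_score(s_loc: float, s_rec: float, alpha: float = 0.8) -> float:
    """Equation (3): weighted combination of localization and recognition scores."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * s_loc + (1.0 - alpha) * s_rec


# Illustration only: the poorest localization score quoted in the text (0.065)
# combined with a matching class (hypothetical labels).
s = local_score(0.065, recognition_score("human", "human"))
print(round(s, 3))  # 0.052
```

With α = 0.8, a localization error dominates the local score, while a class mismatch alone contributes at most 0.2.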

The third step is compensation. Empty rows and columns of the assignment

matrix are identified and completed. In our example, there is no

empty column, meaning that all objects of the interpretation result have been

matched with at least one object of the ground truth. There is consequently no

over-detection. On the other hand, one row (row 2) is empty: one object of the ground

truth has not been detected. This under-detection is compensated by adding one

column with a score of 1 in the corresponding row.

Finally, the global interpretation score is computed, taking into account the

compensation stage and averaging the local interpretation scores.
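
The compensation and averaging steps can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the assignment matrix is assumed to be a list of rows (ground-truth objects) by columns (objects of the interpretation result), with `None` marking unmatched pairs, and the global score is assumed to be the plain mean of all local scores and penalties, since the exact averaging is not detailed here. The example values are hypothetical.

```python
from typing import List, Optional

Matrix = List[List[Optional[float]]]


def compensate(scores: Matrix) -> List[float]:
    """Return the local scores plus one penalty of 1 per empty row or column.

    Rows are ground-truth objects, columns are objects of the interpretation
    result; None marks a pair that was not matched (assumed convention).
    """
    n_cols = len(scores[0]) if scores else 0
    values = [s for row in scores for s in row if s is not None]
    # Empty row: a ground-truth object with no match (under-detection).
    under = sum(1 for row in scores if all(s is None for s in row))
    # Empty column: a detected object with no match (over-detection).
    over = sum(1 for j in range(n_cols) if all(row[j] is None for row in scores))
    return values + [1.0] * (under + over)


def global_score(scores: Matrix) -> float:
    """Mean of the local scores after compensation (assumed plain average)."""
    values = compensate(scores)
    # Convention chosen for this sketch: an empty matrix scores 0.
    return sum(values) / len(values) if values else 0.0


# Hypothetical 3x2 matrix: the second ground-truth object is not detected.
example = [
    [0.02, None],
    [None, None],
    [None, 0.05],
]
print(global_score(example))
```

In this hypothetical case the two matched pairs contribute their local scores and the empty second row contributes a single penalty of 1, so one missed object weighs far more than the localization errors of the matched ones.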

4 Evaluation of Human Detection Algorithms

To evaluate the detection methods presented in Section 2, we built

a set of reference scenarios corresponding to the specific needs expressed by

the industrial partners involved in the CAPTHOM project. An excerpt from an

example scenario is presented in Figure 3. At each location, a set of characteristics

(temperature, speed, posture, activity, etc.) is specified using the formalism defined

within the CAPTHOM project [19].

The three classes of scenarios from which we have built the evaluation dataset

are:
