3 Evaluation Metric
The developed evaluation metric [14] is based on four steps:
(i) object matching, (ii) local evaluation of each matched object in terms of
localization and recognition, (iii) over- and under-detection compensation, and
(iv) computation of the global evaluation score of the considered interpretation result.
Figure 2 illustrates the different stages on an original image extracted from
the 2007 Pascal VOC challenge. For this image, the ground truth is composed of
4 objects, all of which belong to the human class. The interpretation result,
for its part, contains two detected persons. The first person of the ground
truth is well localized and recognized. The last three persons are well recognized
but poorly localized: only one object has been detected instead of three.
The first step, consisting in matching the objects of the ground truth and of
the interpretation result, is done using the PAS metric [4]:
\[
\mathrm{PAS}(I_{gt}, I_i, u, v) = \frac{\mathrm{card}\left(I_{gt}^{r(u)} \cap I_i^{r(v)}\right)}{\mathrm{card}\left(I_{gt}^{r(u)} \cup I_i^{r(v)}\right)} \tag{1}
\]
where $\mathrm{card}(I_{gt}^{r(u)})$ is the number of pixels of object $u$ in the ground truth,
and $\mathrm{card}(I_i^{r(v)})$ is the number of pixels of the detected object $v$ in the
interpretation result. The number of rows of the resulting matching score matrix
corresponds to the number of objects in the ground truth, and the number of
columns corresponds to the number of objects in the interpretation result. This
matrix is computed as in [15]. The values range from 0 to 1, with 1 corresponding
to a perfect localization. From the matching score matrix, objects can be matched
by two methods: the first uses the Hungarian algorithm, which implies
one-to-one matching as in [4]; the second simply applies a threshold, which
enables multiple detections as in [16]. We use the threshold method, with a
threshold set to 0.2 by default, as it allows each object of the interpretation
result to be assigned to several objects of the ground truth, or vice versa.
The first person of the ground truth (object 1) is well localized in the
interpretation result (object 2). Since their overlap score exceeds the threshold,
they are matched, resulting in the value 1 in the corresponding cell of the
assignment matrix. Concerning the group of persons, only two objects of the
ground truth (objects 3 and 4) are matched with the one object of the
interpretation result (object 1).
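
The matching step can be illustrated with a minimal sketch, assuming each
object is represented as a set of pixel coordinates. The helper names
(rectangle, pas, matching_scores, threshold_assignment) and the toy rectangles
are hypothetical: the coordinates are made up to mirror the matching pattern
described above and are not taken from Figure 2. The strict comparison with the
threshold is also an assumption, based on the score "exceeding" the threshold.

```python
from typing import List, Set, Tuple

Pixel = Tuple[int, int]
Region = Set[Pixel]


def rectangle(x0: int, y0: int, x1: int, y1: int) -> Region:
    """Hypothetical helper: pixels of an axis-aligned box (x1, y1 exclusive)."""
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}


def pas(gt_obj: Region, det_obj: Region) -> float:
    """PAS score of equation (1): card(intersection) / card(union)."""
    union = gt_obj | det_obj
    if not union:
        return 0.0  # two empty regions: no overlap by convention (assumption)
    return len(gt_obj & det_obj) / len(union)


def matching_scores(gt: List[Region], det: List[Region]) -> List[List[float]]:
    """Rows index ground-truth objects, columns index detected objects."""
    return [[pas(g, d) for d in det] for g in gt]


def threshold_assignment(scores: List[List[float]],
                         threshold: float = 0.2) -> List[List[int]]:
    """Put 1 where the score exceeds the threshold (strict '>' is assumed).

    Unlike a one-to-one matching, a row or a column may contain several 1s.
    """
    return [[1 if s > threshold else 0 for s in row] for row in scores]


# Toy scene: four ground-truth persons, two detections (made-up coordinates).
ground_truth = [
    rectangle(0, 0, 10, 20),   # object 1: isolated person
    rectangle(20, 0, 26, 20),  # object 2: group member, barely covered
    rectangle(26, 0, 36, 20),  # object 3: group member
    rectangle(36, 0, 46, 20),  # object 4: group member
]
detections = [
    rectangle(25, 0, 46, 20),  # object 1: one box over most of the group
    rectangle(1, 0, 11, 20),   # object 2: slightly shifted isolated person
]

scores = matching_scores(ground_truth, detections)
assignment = threshold_assignment(scores, threshold=0.2)

for row in assignment:
    print(row)
# [0, 1]
# [0, 0]
# [1, 0]
# [1, 0]
```

In this toy scene, ground-truth object 1 is matched with detected object 2,
and ground-truth objects 3 and 4 are both matched with detected object 1,
which a one-to-one assignment could not express.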
The second step consists of the local evaluation of each matched object.
The localization is first evaluated using the Martin metric [17] adapted
to one object:
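\[
S_{loc}(I_{gt}, I_i, u, v) = \min\left( \frac{\mathrm{card}\left(I_{gt \setminus i}^{r(u)}\right)}{\mathrm{card}\left(I_{gt}^{r(u)}\right)}, \frac{\mathrm{card}\left(I_{i \setminus gt}^{r(v)}\right)}{\mathrm{card}\left(I_{i}^{r(v)}\right)} \right) \tag{2}
\]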
with $\mathrm{card}(I_{gt}^{r(u)})$ the number of pixels of object $u$ present in the ground truth and
$\mathrm{card}(I_{gt \setminus i}^{r(u)})$ the number of pixels of object $u$ present in the ground truth but not
 