4.2.6.2 Hit-Rate Curve
The hit-rate curve presents the hit ratio as a function of the quota size.
The hit-rate is calculated by counting the instances that are actually
labeled positive within a given quota [An and Wang (2001)]. More precisely,
for a quota of size j and a ranked set of instances, the hit-rate is defined as:
$$\text{Hit-Rate}(j) = \frac{\sum_{k=1}^{j} t[k]}{j}, \tag{4.10}$$
where $t[k]$ represents the true outcome of the instance located
in the k'th position when the instances are sorted according to their
conditional probability for “positive” in descending order. Note that if the
k'th position can be uniquely defined (i.e., there is exactly one instance that
can be located in this position), then $t[k]$ is either 0 or 1, depending on the
actual outcome of this specific instance. Nevertheless, if the k'th position
is not uniquely defined and there are $m_{k,1}$ instances that can be located in
this position, $m_{k,2}$ of which are truly positive, then:

$$t[k] = \frac{m_{k,2}}{m_{k,1}}. \tag{4.11}$$
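As an illustration, Equations (4.10) and (4.11) can be sketched in a few lines of Python. This is a minimal sketch, not a reference implementation: the function names `tie_adjusted_outcomes` and `hit_rate` and the sample data are hypothetical, and the sketch assumes that the "instances that can be located in this position" are exactly the instances sharing the same conditional probability. Exact fractions are used so that the tie correction is not blurred by floating-point rounding.

```python
from collections import defaultdict
from fractions import Fraction


def tie_adjusted_outcomes(scores, labels):
    """Return [t[1], ..., t[n]] for instances ranked by descending score.

    Assumption: instances with an identical score form one block of
    interchangeable positions. Every position in a block receives
    m_{k,2} / m_{k,1}, the fraction of positives in that block (Eq. 4.11).
    A block of size one therefore yields t[k] in {0, 1}.
    """
    blocks = defaultdict(list)
    for score, label in zip(scores, labels):
        blocks[score].append(label)

    t = []
    for score in sorted(blocks, reverse=True):
        block_labels = blocks[score]
        share = Fraction(sum(block_labels), len(block_labels))
        t.extend([share] * len(block_labels))
    return t


def hit_rate(scores, labels, j):
    """Hit-Rate(j) = (t[1] + ... + t[j]) / j, as in Eq. 4.10."""
    t = tie_adjusted_outcomes(scores, labels)
    if not 1 <= j <= len(t):
        raise ValueError("quota size j must be between 1 and the number of instances")
    return sum(t[:j]) / j


# Hypothetical data: three instances share the score 0.7, one of them positive.
scores = [0.9, 0.7, 0.7, 0.7, 0.2]
labels = [1, 1, 0, 0, 1]

print(hit_rate(scores, labels, 1))  # 1
print(hit_rate(scores, labels, 2))  # 2/3
print(hit_rate(scores, labels, 5))  # 3/5
```

For a quota of 2, the second position falls inside the tied block, so it contributes 1/3 rather than an arbitrary 0 or 1, giving (1 + 1/3) / 2 = 2/3.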
The sum of $t[k]$ over the entire test set is equal to the number
of instances that are labeled “positive”. Moreover,
$\text{Hit-Rate}(j) \approx \text{Precision}(p[j])$, where $p[j]$ denotes the
j'th value of $P_I(\text{pos} \mid x_1), \ldots, P_I(\text{pos} \mid x_m)$
sorted in descending order. The two values are strictly equal when the j'th
position is uniquely defined.
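The two properties stated above can be checked numerically on a small example. The sketch below is illustrative only: the helper names and the data are hypothetical, ties are again assumed to be groups of instances with an identical conditional probability, and Precision(p[j]) is assumed to mean the precision obtained when every instance whose probability is at least p[j] is classified as positive.

```python
from fractions import Fraction
from itertools import groupby


def tie_adjusted_t(scores, labels):
    """t[k] per ranked position; tied positions share their block's positive rate."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    t = []
    for _, block in groupby(ranked, key=lambda pair: pair[0]):
        block_labels = [label for _, label in block]
        t.extend([Fraction(sum(block_labels), len(block_labels))] * len(block_labels))
    return t


def precision_at_threshold(scores, labels, threshold):
    """Precision when all instances with score >= threshold are called positive."""
    selected = [label for score, label in zip(scores, labels) if score >= threshold]
    return Fraction(sum(selected), len(selected))


# Hypothetical data: positions 2 to 4 are tied at 0.7.
scores = [0.9, 0.7, 0.7, 0.7, 0.2]
labels = [1, 1, 0, 0, 1]

t = tie_adjusted_t(scores, labels)
p = sorted(scores, reverse=True)  # p[j - 1] is the j'th largest probability

# The corrected outcomes still sum to the number of positive instances.
assert sum(t) == sum(labels)

for j in range(1, len(t) + 1):
    hit = sum(t[:j]) / j
    prec = precision_at_threshold(scores, labels, p[j - 1])
    print(j, hit, prec)
# 1 1 1
# 2 2/3 1/2
# 3 5/9 1/2
# 4 1/2 1/2
# 5 3/5 3/5
```

Under these assumptions the two measures coincide at positions 1 and 5, which are uniquely defined, and also at position 4, where the quota covers the whole tied block. They differ at positions 2 and 3, where the quota cuts through the tie.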
It should be noted that the hit-rate measure was originally defined
without any reference to the uniqueness of a certain position. However, there
are some classifiers that tend to provide the same conditional probability
to several different instances. For instance, in a decision tree, all instances
in the test set that belong to the same leaf receive the same conditional
probability. Thus, the proposed correction is required in those cases.
Figure 4.4 illustrates a hit-rate curve.
4.2.6.3 Qrecall (Quota Recall)
The hit-rate measure, presented above, is the “precision” equivalent for
quota-limited problems. Similarly, we suggest the Qrecall (for quota recall)
to be the “recall” equivalent for quota-limited problems. The Qrecall for
a certain position in a ranked list is calculated by dividing the number of
positive instances, from the head of the list until that position, by the total