Information Technology Reference
In-Depth Information
According to the above result, the zero values of the pairwise loss functions are
not sufficient conditions for the zero value of (1
MAP). This is a weaker con-
dition than our intuition. As we know, when the pairwise loss function takes a
zero value, every document pair is correctly classified, and correspondingly we
will get the correct ranking of the documents. In this case, we should find that
(1
MAP) equals zero.
It is not clear whether a similar result can also be obtained for NDCG and other
evaluation measures.
The authors of [ 3 ] try to overcome the aforementioned limitations. Specifically,
they show that many of the pairwise loss functions are upper bounds of a quantity,
named the essential loss for ranking. Furthermore, the essential loss is an upper
bound of (1
NDCG) and (1
MAP), and therefore these loss functions are also
upper bounds of (1
MAP).
To better illustrate this result, we first introduce the concept of the essential loss,
which is constructed by modeling ranking as a sequence of classification tasks.
Given a set of documents x and their ground-truth permutation π y
NDCG) and (1
Ω y ,the
ranking problem can be decomposed into several sequential steps. For each step t ,
one tries to distinguish π y (t) , the document ranked at the t th position in π y , from
all the documents ranked below the t th position in π y , using a scoring function f .
If we denote x (t) ={ x π y (t) ,...,x π y (m) }
, one can define a classifier based on f ,
whose target output is π y (t) ,
T
f( x (t) )
=
arg
max
j ∈{ π y (t),...,π y (m) }
f(x j ).
(5.5)
π 1
{
(t), . . . ,
It is clear that there are m
t possible outputs of this classifier, i.e.,
y
π 1
y
. The 0-1 loss for this classification task can be written as follows, where
the second equation is based on the definition of T
(m)
}
f :
L t f ;
x (t), π y (t) = I
π y (t)
{
T
f( x (t) )
=
}
m
=
1
I
.
(5.6)
{ f(π y (t))>f (π y (j)) }
j
=
t
+
1
A simple example is given in Fig. 5.1 to illustrate the aforementioned process of
decomposition. Suppose there are three objects, A , B , and C , and the correspond-
ing ground-truth permutation is π y = (A,B,C) . Suppose the output of the scoring
function for these objects is ( 2 , 3 , 1 ) , and accordingly the predicted ranked list is
π
(B,A,C) . At step one of the decomposition, the scoring function predicts ob-
ject B to be at the top of the list. However, A should be on the top according to π y .
Therefore, a prediction error occurs. For step two, we remove A from both π y and π .
Then the scoring function predicts object B to be on the top of the remaining list.
This is in accordance with π y and there is no prediction error. After that, we further
remove object B , and it is easy to verify there is no prediction error in step three
=
Search WWH ::

Custom Search