(1) The U-statistics View In the U-statistics view, suppose that $(x_u, y_u)$ and $(x_v, y_v)$ are i.i.d. random variables according to distribution $P$ on $X \times Y$, where $X$ stands for the document and $Y$ stands for the ground-truth label of the document. $(x_u, y_u)$ and $(x_v, y_v)$ construct a pair. Given the scoring function $f$, a loss occurs if the documents are not ranked according to their ground-truth labels. Suppose the loss function is $L(f; x_u, x_v, y_{u,v})$, where $y_{u,v} = 2 \cdot I_{\{y_u > y_v\}} - 1$. Again this loss function can represent the pairwise 0-1 loss, or the pairwise surrogate loss functions used by different algorithms. Then the expected risk is defined as

$$R(f) = \int_{(X \times Y)^2} L(f; x_u, x_v, y_{u,v})\, P(dx_u, dy_u)\, P(dx_v, dy_v). \tag{16.3}$$
The expected risk is the average loss that a ranking model $f$ would incur on two random documents. Again, since it is almost impossible to compute the expected risk, the empirical risk on the training set is used as its estimate. Given the training data, the empirical risk can be defined as the following U-statistic:
$$\hat{R}(f) = \frac{2}{m(m-1)} \sum_{u=1}^{m} \sum_{v=u+1}^{m} L(f; x_u, x_v, y_{u,v}). \tag{16.4}$$
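To make the U-statistics empirical risk concrete, here is a minimal Python sketch that evaluates it with the pairwise 0-1 loss; the scoring function and toy data are illustrative assumptions, not part of the text:

```python
from itertools import combinations

def pairwise_01_loss(f, x_u, x_v, y_uv):
    # 0-1 loss: a loss of 1 is incurred when the pair is not ranked
    # in agreement with y_uv (y_uv = +1 means x_u should rank above x_v).
    return 1.0 if y_uv * (f(x_u) - f(x_v)) <= 0 else 0.0

def empirical_risk_u(f, xs, ys):
    # Eq. (16.4): 2/(m(m-1)) times the sum of the loss over all pairs u < v.
    m = len(xs)
    total = 0.0
    for u, v in combinations(range(m), 2):
        y_uv = 2 * (1 if ys[u] > ys[v] else 0) - 1  # y_uv = 2 * I{y_u > y_v} - 1
        total += pairwise_01_loss(f, xs[u], xs[v], y_uv)
    return 2.0 * total / (m * (m - 1))
```

For three documents whose relevance labels agree with a single feature value, a monotone scoring function (e.g., the identity) attains empirical risk 0, while the reversed scoring attains risk 1.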
Specifically, when the ground truth is given as a binary relevance degree, the positive example is denoted as $x^+$, drawn according to $P^+$, and the negative example is denoted as $x^-$, drawn according to $P^-$. Given the training data $x^+ = \{x_j^+\}_{j=1}^{m^+}$ and $x^- = \{x_j^-\}_{j=1}^{m^-}$ (where $m^+$ and $m^-$ are the numbers of positive and negative examples in the training data, respectively), the expected risk and the empirical risk can be refined as follows:

$$R(f) = \int_{X^2} L(f; x^+, x^-)\, P^+(dx^+)\, P^-(dx^-), \tag{16.5}$$

$$\hat{R}(f) = \frac{1}{m^+ m^-} \sum_{u=1}^{m^+} \sum_{v=1}^{m^-} L(f; x_u^+, x_v^-). \tag{16.6}$$
In this case, we usually call the problem a bipartite ranking problem.
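In the bipartite case, the empirical risk under the pairwise 0-1 loss is simply the fraction of mis-ordered positive/negative pairs; with this loss, $1 - \hat{R}(f)$ coincides with the AUC (ignoring ties in scores). A minimal sketch, with assumed toy data:

```python
def empirical_risk_bipartite(f, x_pos, x_neg):
    # Eq. (16.6): average the loss over all m+ * m- positive/negative pairs.
    total = 0.0
    for xp in x_pos:
        for xn in x_neg:
            # 0-1 loss: a positive document should score above a negative one.
            total += 1.0 if f(xp) - f(xn) <= 0 else 0.0
    return total / (len(x_pos) * len(x_neg))
```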
(2) The Average View The average view assumes an i.i.d. distribution of document pairs. More specifically, with the average view, each document pair $(x_u, x_v)$ is given a ground-truth label $y_{u,v} \in Y = \{-1, 1\}$, where $y_{u,v} = 1$ indicates that document $x_u$ is more relevant than $x_v$, and $y_{u,v} = -1$ otherwise. Then $(x_u, x_v, y_{u,v})$ is assumed to be a random variable with probabilistic distribution $P$, and the expected risk can be defined as

$$R(f) = \int_{X^2 \times Y} L(f; x_u, x_v, y_{u,v})\, P(dx_u, dx_v, dy_{u,v}). \tag{16.7}$$
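Under the average view, the natural empirical counterpart of the expected risk is the mean loss over the i.i.d. labeled pairs. The sketch below uses the pairwise logistic surrogate loss as one illustrative choice of $L$; the data and function names are assumptions for illustration:

```python
import math

def pairwise_logistic_loss(f, x_u, x_v, y_uv):
    # Logistic surrogate of the pairwise 0-1 loss (one illustrative choice of L).
    return math.log(1.0 + math.exp(-y_uv * (f(x_u) - f(x_v))))

def empirical_risk_average(f, labeled_pairs):
    # Mean loss over i.i.d. labeled pairs (x_u, x_v, y_uv), an empirical
    # estimate of the expected risk in Eq. (16.7).
    return sum(pairwise_logistic_loss(f, xu, xv, y)
               for xu, xv, y in labeled_pairs) / len(labeled_pairs)
```

A scoring function that orders the pairs correctly yields a strictly smaller surrogate risk than one that reverses them, which is the behavior the surrogate is designed to reward.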