Information Technology Reference
In-Depth Information
18.3.1 Regarding DCG-Based Ranking Error
In [ 6 ], the consistency issue with respect to DCG-based ranking error is studied. In
particular, the study is focused on a specific regression-based ranking method. The
loss function minimized by the method has the following form:
m
w j f(x j ) y j 2
w j max 0 , f(x j ) δ 2 .
L φ (f, x , y ) =
+ u sup
j
(18.5)
j
=
1
The weight function w j is chosen so that it focuses only on the most important
examples (e.g., those examples whose y j =
1) while w j
focuses on the examples
not covered by w j (e.g., those examples whose y j =
0). This loss function basically
requires that the output of the ranking function f on a relevant document should
be as close to 1 as possible, and at the same time requires that the output on an
irrelevant document should not be larger than a small positive value δ by much.
As for the minimization of the above loss function, the following result has been
obtained.
Theorem 18.2 For all scoring functions f , let r f =
f be the induced ranking
model . Let f B be the Bayes optimal scoring function , i . e ., f B (x j ) = y j , and denote
the corresponding ranking model as r B =
sort
f B . Suppose f
sort
is the scoring
function learned by minimizing E
, then in certain conditions ( see [ 6 ]
for more details ) the following inequality holds :
E DCG(r B , x , y )
[
L φ (f
;
x , y )
]
E DCG(r f , x , y )
C E L φ (f
x , y )
E L φ (f ;
x , y ) 1 / 2 ,
;
(18.6)
where C is a constant .
The above theorem basically says that in certain conditions, the minimization of
the expectation of the regression loss given in ( 18.5 ) will lead to the maximization of
the expected DCG value. In other words, the regression loss in ( 18.5 ) is statistically
consistent with DCG-based ranking error.
18.3.2 Regarding Permutation-Level 0-1 Loss
In [ 13 ], the consistency of listwise ranking methods is investigated, with respect to
the permutation-level 0-1 loss. The permutation-level 0-1 loss takes a value of 0 if
the ranked list given by the ranking function is exactly the same as the ground-truth
permutation; and takes a value of 1 otherwise.
In particular, two conditions for the consistency are given in [ 13 ]. One is for the
underlying probability space, called order preserving . The other is for the surrogate
loss function, called order sensitive . The intuitive explanation of these two concepts
are as follows.
Search WWH ::




Custom Search