Information Technology Reference
In-Depth Information
18.3.1 Regarding DCG-Based Ranking Error
In [
6
], the consistency issue with respect to DCG-based ranking error is studied. In
particular, the study is focused on a specific regression-based ranking method. The
loss function minimized by the method has the following form:
m
w
j
f(x
j
)
−
y
j
2
w
j
max
0
,
f(x
j
)
−
δ
2
.
L
φ
(f,
x
,
y
)
=
+
u
sup
j
(18.5)
j
=
1
The weight function
w
j
is chosen so that it focuses only on the most important
examples (e.g., those examples whose
y
j
=
1) while
w
j
focuses on the examples
not covered by
w
j
(e.g., those examples whose
y
j
=
0). This loss function basically
requires that the output of the ranking function
f
on a relevant document should
be as close to 1 as possible, and at the same time requires that the output on an
irrelevant document should not be larger than a small positive value
δ
by much.
As for the minimization of the above loss function, the following result has been
obtained.
Theorem 18.2
For all scoring functions f
,
let r
f
=
f be the induced ranking
model
.
Let f
B
be the Bayes optimal scoring function
,
i
.
e
.,
f
B
(x
j
)
=
y
j
,
and denote
the corresponding ranking model as r
B
=
sort
◦
f
B
.
Suppose f
∗
sort
◦
is the scoring
function learned by minimizing E
,
then in certain conditions
(
see
[
6
]
for more details
)
the following inequality holds
:
E
DCG(r
B
,
x
,
y
)
−
[
L
φ
(f
;
x
,
y
)
]
E
DCG(r
f
,
x
,
y
)
C
E
L
φ
(f
x
,
y
)
−
E
L
φ
(f
∗
;
x
,
y
)
1
/
2
,
≤
;
(18.6)
where C is a constant
.
The above theorem basically says that in certain conditions, the minimization of
the expectation of the regression loss given in (
18.5
) will lead to the maximization of
the expected DCG value. In other words, the regression loss in (
18.5
) is statistically
consistent with DCG-based ranking error.
18.3.2 Regarding Permutation-Level 0-1 Loss
In [
13
], the consistency of listwise ranking methods is investigated, with respect to
the permutation-level 0-1 loss. The permutation-level 0-1 loss takes a value of 0 if
the ranked list given by the ranking function is exactly the same as the ground-truth
permutation; and takes a value of 1 otherwise.
In particular, two conditions for the consistency are given in [
13
]. One is for the
underlying probability space, called
order preserving
. The other is for the surrogate
loss function, called
order sensitive
. The intuitive explanation of these two concepts
are as follows.