Information Technology Reference
In-Depth Information
its counterpart; (iii) there exists a permutation, for which the speed of change in loss
with respect to the score of a document will become faster if exchanging its position
with another document with the same score but ranked lower.
Based on the two new concepts, a theorem similar to Theorem
18.3
has been
obtained, which is now shown.
Theorem 18.4
Let L
φ
(f
m
documents
,
if its top-k subgroup probability space is order preserving with respect
to m
;
x
,π
y
) be a top-k order sensitive loss function
.
Fo r
∀
k
i
=
m
−
k
−
1
document pairs
{
(j
i
,j
i
+
1
)
}
1
and
{
(j
k
+
s
i
,j
k
+
i
:
0
≤
s
i
<i)
}
,
then the
i
=
2
loss L
φ
(f
;
x
,π
y
) is consistent with the top-k true loss
.
In addition to the above result, the change of statistical consistency with respect
to different
k
values has also been discussed in [
12
], and the following conclusions
are obtained.
•
For the consistency with the top-
k
true loss, when
k
becomes smaller, the re-
quirement on the probability space becomes weaker but the requirement on the
surrogate loss function becomes stronger.
•
If we fix the true loss to be top-
k
and the probability space to be top-
k
order pre-
serving, the surrogate loss function should be at most top-
l
(
l
≤
k
) order sensitive
in order to meet the consistency conditions.
According to Theorem
18.4
, the surrogate loss functions optimized by ListMLE
and ListNet are not consistent with the top-
k
true loss. Modifications to the algo-
rithms are needed to make them consistent. For this purpose, one should replace the
permutation-level Plackett-Luce model in ListMLE with the top-
k
Plackett-Luce
model; and change the mapping function in ListNet to retain the order for the top-
k
positions in the ground-truth permutation and assign to all the remaining positions
a small value (which is smaller than the score of any document ranked at the top-
k
positions). Experimental results have shown that with such modifications, the per-
formances of the algorithms can be enhanced in terms of the top-
k
true loss.
18.3.4 Regarding Weighted Kendall's τ
In [
10
], the consistency of some pairwise ranking methods and listwise ranking
methods with respect to the weighted Kendall's
τ
is studied. The weighted Kendall's
τ
is defined as follows:
m
−
1
m
L
0
(f
;
x
,π
y
)
=
D(u,v)I
{
f(x
π
−
1
y
,
(18.7)
(u)
)
−
f(x
π
−
1
y
(v)
)
≤
0
}
u
=
1
v
=
u
+
1
where the ground truth is assumed to be a permutation
π
y
,
D(u,v)
denotes a weight
function on a pair of documents whose positions are
u
and
v
in the ground-truth
permutation respectively. When
D(u,v)
=
D(u)
−
D(v)
, we call it the Difference-
Weighting Kendall's
τ
. When
D(u,v)
=
1, we call it the unweighted Kendall's
τ
.