Information Technology Reference
In-Depth Information
18.2 Consistency Analysis for Document Ranking
18.2.1 Regarding Pairwise 0-1 Loss
In [
5
], the consistency for pairwise ranking methods is discussed, under the frame-
work of document ranking.
Give two documents
x
u
and
x
v
whose labels are
y
u
and
y
v
, the pairwise 0-1 loss
is defined as
I
{
(y
u
−
y
v
)(f (x
u
)
−
f(x
v
))<
0
}
. In other words, if the ranking result given by
function
f
is in the same order of that given by the ground-truth label, the loss is
zero; otherwise the loss is one. Based on the pairwise 0-1 loss, the expected true risk
is defined as (the expectation is taken over documents according to the document
ranking framework)
R
0
(f )
=
E
[
I
{
(y
u
−
y
v
)(f (x
u
)
−
f(x
v
))<
0
}
]
.
(18.1)
The expected surrogate risk is defined by
E
φ
−
·
f(x
u
)
f(x
v
)
,
R
φ
(f )
=
sgn
(y
u
−
y
v
)
−
(18.2)
where
φ
is a convex function, e.g., the exponential, logistic, and hinge function.
The following regret bound is derived in [
5
].
Theorem 18.1
Suppose R
0
=
inf
f
R
0
(f ) and R
φ
=
inf
f
R
φ
(f )
.
Then for all func-
tions f
,
as long as φ is convex
,
we have
R
0
(f )
−
R
0
≤
ψ
−
1
R
φ
(f )
−
R
φ
,
(18.3)
where
H
−
1
H
−
1
,
+
−
x
x
ψ(x)
=
−
2
2
0
ρφ(
ρ)φ(α)
.
H
−
(ρ)
=
inf
−
α)
+
(
1
−
(18.4)
α
:
α(
2
ρ
−
1
)
≤
This theorem basically says that if an algorithm can effectively minimize the
expected surrogate risk to its minimum (the right-hand side of the inequality ap-
proaches zero), then the expected true risk will also be minimized to its minimum
(the left-hand side of the inequality also approaches zero). In other words, the algo-
rithm is statistically consistent if
φ
is a convex function. According to the result, it
is not difficult to verify that Ranking SVM [
8
,
9
], RankBoost [
7
], and RankNet [
3
]
are all consistent with the pairwise 0-1 loss.
18.3 Consistency Analysis for Subset Ranking
There are several works on consistency analysis under the subset ranking frame-
work. All these works assume that the expected risks are computed over the queries.