Information Technology Reference
In-Depth Information
17.3.2 For Subset Ranking
In [
14
], the stability theory [
6
] is extended to perform query-level generalization
analysis on the pairwise approach, under the framework of subset ranking. In par-
ticular, the
average view
is taken in the analysis.
To assist the analysis, the concept of uniform leave-one-query-out pairwise loss
stability (also referred to as
query-level stability
for short) is proposed in [
14
]. Sup-
pose we have learned a ranking model
f
1
from a training set with
n
queries, using
an algorithm
.
Then we randomly remove a query and all its associated document pairs from the
training data, and learn a ranking model
f
2
from the new training set. If the dif-
ference between the losses with respect to
f
1
and
f
2
on any unseen document pair
(x
u
,x
v
)
is smaller than
τ(n)
, we say the algorithm
A
. Suppose
L
is the loss function that is minimized in algorithm
A
has uniform leave-one-query-
out pairwise loss stability with coefficient
τ
with respect to
L
.
Based on the concept of the query-level stability, a query-level generalization
bound has been derived in [
14
], as shown in Theorem
17.8
. The theorem states that if
a pairwise ranking algorithm has query-level stability, then with a large probability,
the expected query-level risk can be bounded by the empirical query-level risk and
a term that depends on the query number and the stability of the algorithm.
A
be a pairwise ranking algorithm
,
(q
1
,S
(
1
)
),...,(q
n
,S
(n)
) be
n training queries and their associated labeled documents
,
and let L be the pairwise
loss function
.
If
1.
Theorem 17.8
Let
A
(q
1
,S
(
1
)
), . . . , (q
n
,S
(n)
),
2
∀
∀
q
∈
Q
,(x
u
,x
v
,y
u,v
)
∈
X
×
Y
,
|
L(f
(q
i
,S
(i)
)
i
=
1
;
x
u
,x
v
,y
u,v
)
|≤
B
,
2.
A
has query-level stability with coefficient τ
,
(q
i
,S
(i)
)
n
i
−
{
}
then with probability at least
1
δ(
0
<δ<
1
) over the samples of
in the product space
i
=
1
{
Q
×
=
1
2
)
∞
}
(
X
×
Y
,
the following inequality holds
:
log
δ
2
n
+
4
nτ (n)
B
≤
R(f
+
+
R(f
1
)
1
)
2
τ(n)
.
(17.13)
n
i
=
n
i
=
{
(q
i
,S
(i)
)
}
{
(q
i
,S
(i)
)
}
When using this theorem to perform the query-level generalization analysis on
pairwise ranking algorithms, what one needs to do is to compute the query-level
stability coefficient
τ(n)
of the algorithms.
For example, as shown in [
14
], Ranking SVM [
11
,
12
] has a query-level stability
with coefficient
τ(n)
4
κ
2
m
(i)
˜
m
(i)
is the number of document
=
λn
×
max
m
(i)
, where
˜
n
i
=
1
1
κ
2
<
pairs associated with query
q
i
, and
. On this basis, one
can have the following discussions regarding the query-level generalization ability
of Ranking SVM.
•
∀
x
∈
X
,K(x,x)
≤
∞
When the number of training queries approaches infinity, with a large probability
the empirical risk of Ranking SVM will converge to its expected risk, at a rate of
O(
1
√
n
)
.