Information Technology Reference
InDepth Information
17.3.2 For Subset Ranking
In [
14
], the stability theory [
6
] is extended to perform querylevel generalization
analysis on the pairwise approach, under the framework of subset ranking. In par
ticular, the
average view
is taken in the analysis.
To assist the analysis, the concept of uniform leaveonequeryout pairwise loss
stability (also referred to as
querylevel stability
for short) is proposed in [
14
]. Sup
pose we have learned a ranking model
f
1
from a training set with
n
queries, using
an algorithm
.
Then we randomly remove a query and all its associated document pairs from the
training data, and learn a ranking model
f
2
from the new training set. If the dif
ference between the losses with respect to
f
1
and
f
2
on any unseen document pair
(x
u
,x
v
)
is smaller than
τ(n)
, we say the algorithm
A
. Suppose
L
is the loss function that is minimized in algorithm
A
has uniform leaveonequery
out pairwise loss stability with coefficient
τ
with respect to
L
.
Based on the concept of the querylevel stability, a querylevel generalization
bound has been derived in [
14
], as shown in Theorem
17.8
. The theorem states that if
a pairwise ranking algorithm has querylevel stability, then with a large probability,
the expected querylevel risk can be bounded by the empirical querylevel risk and
a term that depends on the query number and the stability of the algorithm.
A
be a pairwise ranking algorithm
,
(q
1
,S
(
1
)
),...,(q
n
,S
(n)
) be
n training queries and their associated labeled documents
,
and let L be the pairwise
loss function
.
If
1.
Theorem 17.8
Let
A
(q
1
,S
(
1
)
), . . . , (q
n
,S
(n)
),
2
∀
∀
q
∈
Q
,(x
u
,x
v
,y
u,v
)
∈
X
×
Y
,

L(f
(q
i
,S
(i)
)
i
=
1
;
x
u
,x
v
,y
u,v
)
≤
B
,
2.
A
has querylevel stability with coefficient τ
,
(q
i
,S
(i)
)
n
i
−
{
}
then with probability at least
1
δ(
0
<δ<
1
) over the samples of
in the product space
i
=
1
{
Q
×
=
1
2
)
∞
}
(
X
×
Y
,
the following inequality holds
:
log
δ
2
n
+
4
nτ (n)
B
≤
R(f
+
+
R(f
1
)
1
)
2
τ(n)
.
(17.13)
n
i
=
n
i
=
{
(q
i
,S
(i)
)
}
{
(q
i
,S
(i)
)
}
When using this theorem to perform the querylevel generalization analysis on
pairwise ranking algorithms, what one needs to do is to compute the querylevel
stability coefficient
τ(n)
of the algorithms.
For example, as shown in [
14
], Ranking SVM [
11
,
12
] has a querylevel stability
with coefficient
τ(n)
4
κ
2
m
(i)
˜
m
(i)
is the number of document
=
λn
×
max
m
(i)
, where
˜
n
i
=
1
1
κ
2
<
pairs associated with query
q
i
, and
. On this basis, one
can have the following discussions regarding the querylevel generalization ability
of Ranking SVM.
•
∀
x
∈
X
,K(x,x)
≤
∞
When the number of training queries approaches infinity, with a large probability
the empirical risk of Ranking SVM will converge to its expected risk, at a rate of
O(
1
√
n
)
.