Information Technology Reference
In-Depth Information
While the above work uses the concept of the VC dimension for general-
ization analysis, the notion of the bipartite rank-shatter coefficient, denoted as
r( F ,m + ,m ) ,isusedin[ 2 ]. This new notion has a similar meaning to the shatter-
ing coefficient in classification [ 16 , 17 ].
Using the bipartite rank-shatter coefficient as a tool, the following theorem has
been proven in [ 2 ]. For the class of linear scoring functions in the one-dimensional
feature space, r(
,m + ,m ) is a constant, regardless of the values of m +
F
and
m . In this case, the bound converges to zero at a rate of O( max
1
m +
1
{
m }
) ,
and is therefore tighter than the bound based on the VC dimension (as given in
Theorem 17.1 ). For the class of linear scoring functions in the d -dimensional fea-
ture space ( d> 1), r(
,
,m + ,m ) is of the order O((m + m ) d ) , and in this case
the bound has a s im ilar con vergence rate to that based on the VC dimension, i.e.,
O( max
F
log (m + )
m +
, log (m )
m
{
} ) .
F
X
Theorem 17.2 Let
be the class of real-valued functions on
, then with prob-
ability at least 1
δ (0 <δ< 1),
8 (m + +
m )( log δ +
R 0 (f )
R 0 (f )
, 2 m + , 2 m ))
log r(
F
f
F
,
.
m + m
(17.2)
Note that the above two results are only applicable to the case of bipartite ranking.
Some other researchers have extended these results to more general cases, e.g., the
case of k -partite ranking [ 9 , 15 ].
Here we take [ 9 ] as an example. In this work, the labels of documents are no
longer assumed to be positive or negative. Instead, a document is preferred to an-
other one if and only if the label of the former document is larger than that of the
latter document. Specifically, given two documents x u and x v whose labels are y u
and y v , the pairwise 0-1 loss is defined as I { (y u y v )(f (x u ) f(x v ))< 0 }
. In other words,
if the ranking result given by function f is in the same order of that given by the
ground-truth label, the loss is zero; otherwise the loss is one. Based on the pairwise
0-1 loss, the expected risks and empirical risks are defined as below.
R 0 (f )
=
E
[
I
} ]
,
(17.3)
{
(y u
y v )(f (x u )
f(x v ))< 0
m
m
2
m(m
R 0 (f )
=
I
.
(17.4)
{
(y u
y v )(f (x u )
f(x v ))< 0
}
1 )
u
=
1
v
=
u
+
1
Using the properties of U-statistics [ 9 ], the following generalization bound is
obtained with respect to the aforementioned empirical and expected risks.    Search WWH ::

Custom Search