Theorem 17.3 With probability at least $1-\delta$ ($0<\delta<1$), the following inequality holds:
$$R_0(f) \le \hat{R}_0(f) + 4c\sqrt{\frac{V(\tilde{F})}{m}} + 4\sqrt{\frac{\ln\frac{1}{\delta}}{m}}, \qquad (17.5)$$
where $\tilde{F}=\{\operatorname{sgn}(f(x_u)-f(x_v));\, f\in F\}$, and $V(\tilde{F})$ is its VC dimension; $c$ is a universal constant.
Compared with the results in [10], the bound in Theorem 17.3 appears tighter, although it is also based on the VC dimension. This is mainly because the analysis exploits the properties of U-statistics.
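To get a feel for how such a bound scales with the sample size, one can evaluate its deviation term numerically. The sketch below assumes the bound in (17.5) takes the form $\hat{R}_0(f) + 4c\sqrt{V(\tilde{F})/m} + 4\sqrt{\ln(1/\delta)/m}$, and sets the universal constant $c$ to 1 purely for illustration (its actual value is not given in the text):

```python
import math

def vc_bound_gap(v_dim, m, delta, c=1.0):
    """Deviation term of the VC-style bound in (17.5):
    4*c*sqrt(V/m) + 4*sqrt(ln(1/delta)/m).
    c is a universal constant; set to 1 here only for illustration."""
    return 4 * c * math.sqrt(v_dim / m) + 4 * math.sqrt(math.log(1 / delta) / m)

# The gap between expected and empirical risk shrinks at rate O(1/sqrt(m)).
for m in (100, 10_000, 1_000_000):
    print(m, round(vc_bound_gap(v_dim=10, m=m, delta=0.05), 4))
```

As expected from the $1/\sqrt{m}$ rate, multiplying the number of instances by 100 shrinks the gap by a factor of 10.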
In addition to the discussion of the pairwise 0-1 loss, generalization bounds with respect to surrogate loss functions are also derived in [9]. Suppose the surrogate loss function is $\varphi$; then the expected and empirical surrogate risks can be defined as follows.
$$R_\varphi(f) = E\bigl[\varphi\bigl(\operatorname{sgn}(y_u - y_v)\cdot\bigl(f(x_u) - f(x_v)\bigr)\bigr)\bigr], \qquad (17.6)$$
$$\hat{R}_\varphi(f) = \frac{2}{m(m-1)}\sum_{u=1}^{m}\sum_{v=u+1}^{m}\varphi\bigl(\operatorname{sgn}(y_u - y_v)\cdot\bigl(f(x_u) - f(x_v)\bigr)\bigr). \qquad (17.7)$$
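The empirical surrogate risk in (17.7) is simply an average of $\varphi$ over all document pairs. A minimal sketch, using the logistic loss $\varphi(t)=\ln(1+e^{-t})$ as one illustrative choice of surrogate (the text does not fix a particular $\varphi$):

```python
import math

def empirical_surrogate_risk(scores, labels,
                             phi=lambda t: math.log(1 + math.exp(-t))):
    """Empirical surrogate risk of (17.7):
    (2 / (m(m-1))) * sum over pairs u < v of
        phi(sgn(y_u - y_v) * (f(x_u) - f(x_v))).
    `scores` holds f(x_u), `labels` holds y_u; phi defaults to the
    logistic loss purely as an example surrogate."""
    m = len(scores)
    sgn = lambda t: (t > 0) - (t < 0)
    total = sum(
        phi(sgn(labels[u] - labels[v]) * (scores[u] - scores[v]))
        for u in range(m) for v in range(u + 1, m)
    )
    return 2 * total / (m * (m - 1))

# A ranker whose scores agree with the labels incurs lower surrogate
# risk than one whose scores are reversed.
good = empirical_surrogate_risk([3.0, 2.0, 1.0], [2, 1, 0])
bad = empirical_surrogate_risk([1.0, 2.0, 3.0], [2, 1, 0])
print(good, bad)
```

Any convex, non-increasing $\varphi$ (hinge, exponential, logistic) can be swapped in through the `phi` argument.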
The corresponding generalization bound is shown in the following theorem.
Theorem 17.4 With probability at least $1-\delta$ ($0<\delta<1$), the following inequality holds:
$$R_\varphi(f) \le \hat{R}_\varphi(f) + 4cB\varphi(B)\sqrt{\frac{V(\tilde{F})}{m}} + \sqrt{\frac{B^2\ln\frac{1}{\delta}}{2m}}. \qquad (17.8)$$
Here $\tilde{F}=\{\operatorname{sgn}(f(x_u)-f(x_v));\, f\in F\}$, and $V(\tilde{F})$ is its VC dimension; $c$ is a universal constant; the output of the function $f$ is uniformly bounded between $-B$ and $B$.
17.2.2 For Subset Ranking
In [ 13 ], the uniform bounds for the listwise surrogate loss functions are obtained in
the setting of subset ranking. Since queries are considered i.i.d. random variables
in subset ranking, the corresponding generalization bound is also referred to as the
query-level generalization bound. Under the subset ranking framework, the expected
and empirical risks are defined as in (16.13) and (16.14) (see Chap. 16).
Technically, the theory of the Rademacher average (RA) [4, 5] is used in the analysis. The RA measures how much a function class can fit random noise, and is defined below.
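Before the formal definition, the intuition can be made concrete: the empirical RA of a class is the expected best correlation its members achieve with a random sign vector $\sigma$, and it can be estimated by Monte Carlo. A toy sketch, assuming a finite function class represented as score vectors (a stand-in for illustration, not the subset-ranking definition itself):

```python
import random

def empirical_rademacher(function_values, n_trials=2000, seed=0):
    """Monte Carlo estimate of the empirical Rademacher average
        E_sigma[ sup_f (1/m) * sum_i sigma_i * f(x_i) ],
    where `function_values` is a finite class given as one score
    vector per function f. A richer class correlates better with
    random signs, hence has a larger RA."""
    rng = random.Random(seed)
    m = len(function_values[0])
    total = 0.0
    for _ in range(n_trials):
        sigma = [rng.choice((-1, 1)) for _ in range(m)]
        total += max(
            sum(s * v for s, v in zip(sigma, f)) / m
            for f in function_values
        )
    return total / n_trials

# A singleton class fits noise poorly; adding the negated function
# (so one member always aligns with sigma) raises the RA.
f1 = [1.0] * 8
small = empirical_rademacher([f1])
large = empirical_rademacher([f1, [-v for v in f1]])
print(small, large)
```

The singleton class yields an estimate near zero, while the two-function class yields a clearly positive value, matching the intuition that RA grows with the capacity of the class.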