Information Technology Reference
In-Depth Information
If we only have a limited budget to label C documents in total, according to Theo-
rem 17.6 , there is an optimal trade off between the number of training queries and
that of training documents per query. This is consistent with previous empirical
findings in [ 18 ].
Actually one can attain the optimal trade off by solving the following optimiza-
tion problem:
2 M 2 log ( δ )
n
min
n,m ( 1 ) ,...,m (n)
D(L F ,n) +
D L
, m (i)
2
n
n
2 M 2 log δ
m (i) n 2
1
n
+
+
F
i
=
1
i
=
1
n
s . t .
m i =
C.
i =
1
This optimization problem is easy to solve. For example, if VC( F
)
=
V ,
for th e pairwise 0-1 loss, the solution to the optimization problem is n
=
+ 2lo g ( 4 /δ)
c 1 2 V
c 1 V
C , m (i)
C
n
, where c 1 is a constant. From this result we note
the following.
n decreases with the increasing capacity of the function class. That is, we should
label fewer queries and more documents per query when the hypothesis space is
larger.
For fixed hypothesis space, n increases with the confidence level δ . That is, we
should label more query if we want the bound to hold with a larger probability.
The above findings can be used to explain the behavior of existing pairwise rank-
ing algorithms, and can be used to guide the construction of training set for learning
to rank.
17.3 Algorithm-Dependent Generalization Bound
Note that the uniform generalization bound is valid for any function in the hypoth-
esis space under investigation, no matter whether the function is the optimal one
learned by a given algorithm or not. Therefore, it is understandable that the bound
might not be very tight. In contrast, it is possible that we can get a tighter bound for
the optimal function learned by the algorithm. This is just the motivation of devel-
oping the algorithm-dependent generalization bounds. 2
Again, in this section, we
2 Note that the disadvantage of algorithm-dependent bounds lies in that they can only be used for
specific algorithms, and may not be derived for every algorithm.
Search WWH ::




Custom Search