When the number of training queries is finite, the expected risk and the empirical
risk are not necessarily close to each other.
Furthermore, as shown in [14], IR-SVM [7] has a query-level stability with coefficient τ(n) = 4κ²/(λn). On this basis, one can find that when the number of training queries approaches infinity, with a large probability the empirical risk of IR-SVM will converge to its expected risk, at a convergence rate of O(1/√n). When the number of queries is finite, the query-level generalization bound is a decreasing function of the number of training queries.
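To make the scaling behavior concrete, the following sketch (not from the text) plugs hypothetical constants into the stability coefficient τ(n) = 4κ²/(λn) and into the O(1/√n) order of the generalization gap, to show how both shrink as the number of training queries n grows. The values of κ and λ are assumptions chosen purely for illustration.

```python
import math

# Hypothetical constants for illustration only: kappa bounds the feature
# norm and lam is the regularization parameter of IR-SVM.
kappa = 1.0
lam = 0.01

def stability_coefficient(n):
    """Query-level stability coefficient tau(n) = 4*kappa^2 / (lambda * n)."""
    return 4 * kappa ** 2 / (lam * n)

def gap_order(n):
    """Order of the generalization gap, O(1/sqrt(n)), up to constants."""
    return 1.0 / math.sqrt(n)

# Both quantities decrease as the number of training queries grows.
for n in (100, 1000, 10000):
    print(f"n={n:>6}  tau(n)={stability_coefficient(n):.4f}  "
          f"1/sqrt(n)={gap_order(n):.4f}")
```

Note that τ(n) decays at rate 1/n, faster than the 1/√n order of the overall bound; the slower √n term dominates the generalization gap.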
By comparing the query-level generalization abilities of Ranking SVM and IR-SVM, we can see that the convergence rates for these two algorithms are both O(1/√n). However, when the number of training queries is finite, the bound for IR-SVM is much tighter than that for Ranking SVM.
17.3.3 For Two-Layer Ranking
As far as we know, there is still no work on the algorithm-dependent generalization
bounds under the two-layer ranking framework.
17.4 Summary
In this chapter, we have introduced the generalization analysis of ranking algorithms, with different types of bounds and under different statistical ranking frameworks.
As can be seen, early works assume the use of the document ranking frame-
work. As a result, the corresponding theoretical results can describe the accuracy of
ranking unseen document pairs. However, as mentioned in [14], such a result is not highly desired in the context of information retrieval. This is mainly because people care more about the ranking accuracy for new queries than for new document pairs. Some attempts have been made in this direction [13, 14]; however, these
theoretical results can only explain the change of generalization ability with respect
to the increasing number of training queries, and ignore the influence of the num-
ber of training documents per query on the generalization ability. In order to obtain
such kinds of results, one needs to investigate the generalization theory under the
two-layer ranking framework. Work along this line is, however, still very limited [8].
For example, currently the two-layer generalization ability of listwise ranking meth-
ods is not clear, and the study on two-layer stability (or other algorithm-dependent
bounds) is still missing. More work is expected to be conducted along this line.
For clarity, we summarize what has been introduced in this chapter using Tables 17.2 and 17.3. From the tables, we can see that there is still large space to explore in this direction. Generalization analysis for ranking is still an emerging research area, and we can expect its fast development in the future.