Chapter 15
Statistical Learning Theory for Ranking
Abstract In this chapter, we introduce statistical learning theory for ranking. To
better understand existing learning-to-rank algorithms, and to design better ones,
it is very helpful to understand their theoretical properties in depth. In this
chapter, we give the big picture of theoretical analysis for ranking, and point
out several important issues to be investigated: the statistical ranking framework,
generalization ability, and statistical consistency of ranking methods.
15.1 Overview
As a new machine learning problem, ranking is not only about effective algorithms
but also about the theory behind these algorithms.
Theoretical analysis of an algorithm always plays an important role in machine
learning. This is because, in practice, one can only observe experimental results
on relatively small-scale datasets (e.g., the experimental results on the LETOR
benchmark datasets introduced in Chap. 11). To some extent, such empirical results
might not be fully reliable: a small training set sometimes cannot fully realize
the potential of a learning algorithm, and a small test set sometimes cannot
reflect the true performance of an algorithm. This is because the input and output
spaces are too large to be well represented by a small number of samples. In this
regard, theory is needed to analyze the performance of a learning algorithm in the
limit where the training data become infinite and the test data are randomly
sampled from the input and output spaces.
15.2 Statistical Learning Theory
In Fig. 15.1, we give a typical taxonomy of statistical learning theory for
empirical risk minimization, which summarizes most of the theoretical aspects that
one is interested in.
Basically, to understand a learning problem theoretically, we need a statistical
framework, which describes the assumptions on the data generation process (e.g.,
whether the instances are i.i.d.). We also need to define the true loss of the
learning problem.
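As a concrete sketch of these two ingredients (using generic notation that is not
fixed in this excerpt), the empirical risk minimization framework contrasts the
expected (true) risk of a ranking function with its empirical counterpart on a
finite training set:

```latex
% Expected (true) risk of a ranking function f under a true loss L,
% where the instances (x, y) are drawn i.i.d. from a distribution P:
R(f) = \mathbb{E}_{(x,y)\sim P}\bigl[L(f; x, y)\bigr]

% Empirical risk of f on a training set \{(x_i, y_i)\}_{i=1}^{m}:
\hat{R}(f) = \frac{1}{m}\sum_{i=1}^{m} L(f; x_i, y_i)
```

Generalization analysis then studies how the gap between $R(f)$ and $\hat{R}(f)$
behaves over the function class as $m$ grows, and consistency analysis asks
whether minimizing a (surrogate of the) empirical risk also minimizes the true
risk in the large-sample limit.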