Information Technology Reference
Abstract In this chapter, we summarize the entire topic. In particular, we present the
example algorithms introduced in this topic in a figure. We then answer several
important questions regarding learning to rank that were raised at the beginning of
the topic.
In this topic, we have given a comprehensive overview of the state of the art in
learning to rank.
We have introduced three major approaches to learning to rank. The first is called
the pointwise approach, which reduces ranking to regression, classification, or
ordinal regression on each single document. The second is called the pairwise
approach, which formulates ranking as a classification problem on each
document pair. The third is called the listwise approach, which regards ranking as
a new problem, and tries to optimize a measure-specific or non-measure-specific
loss function defined on all the documents associated with a query. We have
introduced the representative algorithms of these approaches, discussed their
advantages and disadvantages, and validated their empirical effectiveness on the
LETOR benchmark datasets.
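To make the distinction between the three approaches concrete, the following is a minimal sketch of one possible loss for each, computed for the documents of a single query. The function names and the specific loss choices (squared error, a logistic loss on pairs, and a softmax cross entropy over the list) are illustrative assumptions, not the definitive formulation of any particular algorithm discussed in this topic.

```python
import math

def pointwise_loss(scores, labels):
    """Pointwise: regression on each single document (squared error)."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(scores)

def pairwise_loss(scores, labels):
    """Pairwise: classification on each document pair (logistic loss).

    Only pairs (i, j) where document i is labeled more relevant than
    document j contribute; the loss is small when s_i exceeds s_j.
    """
    total, pairs = 0.0, 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:
                total += math.log1p(math.exp(-(scores[i] - scores[j])))
                pairs += 1
    return total / pairs if pairs else 0.0

def _softmax(values):
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in values]
    z = sum(exps)
    return [e / z for e in exps]

def listwise_loss(scores, labels):
    """Listwise: a non-measure-specific loss on the whole document list.

    Cross entropy between the softmax of the labels and the softmax of
    the scores (an assumed, simplified top-one probability model).
    """
    p_true = _softmax(labels)
    p_pred = _softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(p_true, p_pred))

# Hypothetical query with three documents; a higher label means more relevant.
labels = [2.0, 1.0, 0.0]
good_scores = [3.0, 2.0, 1.0]  # agrees with the label order
bad_scores = [1.0, 2.0, 3.0]   # reverses the label order
```

In this sketch the pointwise loss looks at one document at a time, the pairwise loss only at relative order within pairs, and the listwise loss at all documents of the query jointly. For the hypothetical scores above, the pairwise and listwise losses are lower for `good_scores` than for `bad_scores`.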
We have mentioned some advanced tasks for learning to rank. These advanced
tasks turn out to be more complex than relevance-based document ranking. For
example, one may need to account for diversity in the ranking result, to make use
of unlabeled data to improve training effectiveness, to transfer knowledge from
one task or domain to another, and to apply different ranking models to
different queries.
We have discussed the practical issues in learning to rank, such as data
processing and applications of learning to rank. Since manual judgment is costly, in
many current practices the ground truths are mined from the click-through logs
of search engines. Because the log data contain noise and position bias, it
is necessary to build user behavior models to remove their influence.
Furthermore, when using an existing learning-to-rank algorithm in real applications,
one needs to go through a procedure that includes training data construction, feature
extraction, query and document selection, and feature selection.
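As an illustration of how a user behavior model can reduce position bias in click-through data, here is a minimal sketch under a simple position-based assumption: a click happens only if the user examines the result and finds it relevant, and the chance of examination depends only on the rank. The examination probabilities, the log format, and the function name are hypothetical; in practice the probabilities would themselves be estimated from the logs.

```python
from collections import defaultdict

# Hypothetical probability that a user examines each rank (1-indexed).
EXAMINATION_PROB = {1: 1.0, 2: 0.6, 3: 0.4}

def debiased_relevance(click_log, exam_prob=EXAMINATION_PROB):
    """Estimate relevance as clicks divided by expected examinations.

    click_log is an iterable of (doc_id, rank, clicked) impressions.
    Dividing by the summed examination probability, instead of the raw
    impression count, compensates for documents shown at low ranks.
    """
    clicks = defaultdict(float)
    expected_examinations = defaultdict(float)
    for doc_id, rank, clicked in click_log:
        clicks[doc_id] += 1.0 if clicked else 0.0
        expected_examinations[doc_id] += exam_prob[rank]
    return {
        doc_id: clicks[doc_id] / expected_examinations[doc_id]
        for doc_id in expected_examinations
        if expected_examinations[doc_id] > 0
    }

# Hypothetical log: "a" is shown at rank 1, "b" at rank 3.
log = [("a", 1, True), ("a", 1, False)]
log += [("b", 3, True)] + [("b", 3, False)] * 4
estimates = debiased_relevance(log)
```

In this hypothetical log the raw click-through rate of "b" (1 click in 5 impressions) is lower than that of "a" (1 click in 2 impressions), yet after correcting for examination the two documents receive roughly the same relevance estimate. The corrected values could then serve as ground-truth labels when constructing training data.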
We have introduced the statistical learning theory for ranking, which turns out to
be very different from the theories for conventional machine learning problems