“Nothing is more practical than theory” [ 77 ]. After introducing the algorithms
and their applications, we will turn to the theoretical part of learning to rank. In
particular, we will discuss the theoretical guarantee of achieving good ranking per-
formance on unseen test data by minimizing the loss function on the training data.
This is related to the generalization ability and statistical consistency of ranking
methods. We will discuss these topics in Chaps. 15, 16, 17, and 18. In Chaps. 19
and 20, we will summarize the book and present some future research topics.
As for the writing of this book, we do not aim to be fully rigorous. Instead, we
try to provide insights into the basic ideas. However, we will inevitably use mathe-
matics to better illustrate the problem, especially when we move into the theoretical
discussions on learning to rank. In the corresponding discussions, we will have to
assume familiarity with basic concepts of probability theory and statistical learning.
We have listed some basics of machine learning, probability theory, algebra, and
optimization in Chaps. 21 and 22. We also provide some related materials and en-
courage readers to consult them in order to obtain a more comprehensive overview
of the background knowledge for this book.
Throughout the book, we will use the notation rules listed in Table 1.3. Here
we would like to add one more note. Since in practice the hypothesis h is usually
defined with a scoring function f , we sometimes use L(h) and L(f ) interchangeably
to denote the loss function. When we need to emphasize the parameter of the
scoring function f , we will write f(w,x) instead of f(x) in the discussion, although
they actually mean the same thing. We sometimes also refer to w directly as the
ranking model when there is no risk of confusion.
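To make this notation concrete, here is a minimal sketch in Python. The linear form of the scoring function, the squared loss, and the toy numbers are our own assumptions for illustration; the book's actual loss functions are introduced in later chapters.

```python
import numpy as np

def f(w, x):
    """Scoring function f(w, x): score of feature vector x under parameter w.

    Assumed linear here purely for illustration; writing f(w, x) rather than
    f(x) emphasizes the parameter w, as in the notation note above.
    """
    return float(np.dot(w, x))

def L(w, X, y):
    """A loss written on the parameter w (equivalently on the hypothesis h or
    the scoring function f it defines). A simple squared loss is assumed."""
    return sum((f(w, x) - yi) ** 2 for x, yi in zip(X, y))

# Toy data: two documents with two features each, and their target scores.
w = np.array([0.5, -0.2])
X = [np.array([1.0, 2.0]), np.array([0.0, 1.0])]
y = [1.0, 0.0]

print(L(w, X, y))  # loss of the ranking model w on this toy training set
```

Since h is determined by f , and f by w, the three notations L(h), L(f ), and (as above) a loss computed from w all refer to the same quantity.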
1.5 Exercises
1.1 How can one estimate the size of the Web?
1.2 Investigate the relationship between the formula of BM25 and the log odds of
relevance.
1.3 List different smooth functions used in LMIR, and compare them.
1.4 Use the view of the Markov process to explain the PageRank algorithm.
1.5 Enumerate all the applications of ranking that you know, in addition to docu-
ment retrieval.
1.6 List the differences between generative learning and discriminative learning.
1.7 Discuss the connections between different evaluation measures for informa-
tion retrieval.
1.8 Given text classification as the task, and linear regression as the algorithm,
illustrate the four components of machine learning in this case.
1.9 Discuss the major differences between ranking and classification (regression).
1.10 List the major differences between the three approaches to learning to rank.
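As a starting point for Exercise 1.4, the Markov-process view of PageRank can be sketched as a power iteration toward the stationary distribution of a random surfer. The three-page link graph and the damping value below are our own assumptions for illustration.

```python
import numpy as np

# Hypothetical 3-page link graph: adjacency[i, j] = 1 if page i links to page j.
adjacency = np.array([[0, 1, 1],
                      [1, 0, 0],
                      [1, 1, 0]], dtype=float)

def pagerank(adj, damping=0.85, iters=100):
    n = adj.shape[0]
    # Row-normalize the adjacency matrix to obtain the transition matrix of
    # the random surfer's Markov chain (each row sums to 1).
    P = adj / adj.sum(axis=1, keepdims=True)
    # Damping mixes in uniform teleportation, which makes the chain ergodic
    # and guarantees a unique stationary distribution.
    G = damping * P + (1 - damping) / n
    # Power iteration: repeatedly take one step of the chain from a uniform
    # start; the distribution converges to the stationary one, i.e. PageRank.
    r = np.full(n, 1.0 / n)
    for _ in range(iters):
        r = r @ G
    return r

r = pagerank(adjacency)
print(r)  # page 0, linked to by both other pages, receives the highest score
```

The PageRank of a page is thus the long-run fraction of time the random surfer spends on it, which is exactly the stationary distribution of this Markov chain.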