While information retrieval researchers were struggling with these problems,
machine learning was demonstrating its effectiveness in automatically tuning
parameters, in combining multiple pieces of evidence, and in avoiding over-fitting.
It therefore seems quite promising to adopt machine learning technologies to solve
the aforementioned problems in ranking.
1.3.1 Machine Learning Framework
In much machine learning research (especially discriminative learning), attention
has been paid to the following key components.13
1. The input space , which contains the objects under investigation. Usually objects
are represented by feature vectors, extracted according to different applications.
2. The output space, which contains the learning target with respect to the input
objects. There are two related but different definitions of the output space in
machine learning.14 The first is the output space of the task, which is highly
dependent on the application. For example, in regression, the output space is
the space of real numbers R; in classification it is the set of discrete categories
{1, 2, ..., K}. The second is the output space to facilitate the learning process.
This may differ from the output space of the task. For example, when one uses
regression technologies to solve the problem of classification, the output space
that facilitates learning is the space of real numbers rather than the set of discrete categories.
3. The hypothesis space, which defines the class of functions mapping the input
space to the output space. That is, the functions operate on the feature vectors
of the input objects, and make predictions according to the format of the output
space.
4. In order to learn the optimal hypothesis, a training set is usually used, which
contains a number of objects and their ground truth labels, sampled from the
product of the input and output spaces. The loss function measures how well
the prediction generated by the hypothesis agrees with the ground truth label.
For example, widely used loss functions for classification include the exponential
loss, the hinge loss, and the logistic loss (see the sketch following this list). It is
clear that the loss func-
tion plays a central role in machine learning, since it encodes the understanding
of the target application (i.e., what prediction is correct and what is not). With
the loss function, an empirical risk can be defined on the training set, and the op-
timal hypothesis is usually (but not always) learned by means of empirical risk
minimization.
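To make these loss functions concrete, the following is a minimal sketch in Python (our own illustration, not part of the original text), assuming a binary label y in {-1, +1} and a real-valued prediction s produced by the hypothesis:

import numpy as np

def exponential_loss(y, s):
    # exp(-y * s): the loss minimized by AdaBoost-style methods
    return np.exp(-y * s)

def hinge_loss(y, s):
    # max(0, 1 - y * s): the loss used by support vector machines
    return np.maximum(0.0, 1.0 - y * s)

def logistic_loss(y, s):
    # log(1 + exp(-y * s)): the loss used by logistic regression
    return np.log1p(np.exp(-y * s))

All three losses penalize predictions whose sign disagrees with the label, and they differ mainly in how strongly they punish large negative margins y * s.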
We plot the relationship between these four components in Fig. 1.5 for ease of
understanding.
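To further illustrate how the four components fit together, the sketch below (again our own illustration, assuming a linear hypothesis space and the logistic loss) builds a small synthetic training set, defines the empirical risk, and learns a hypothesis by minimizing that risk with plain gradient descent:

import numpy as np

rng = np.random.default_rng(0)

# Input space: 5-dimensional feature vectors; output space: labels in {-1, +1}.
n, d = 200, 5
X = rng.normal(size=(n, d))
w_star = rng.normal(size=d)                  # hypothetical weights used only to generate labels
y = np.where(X @ w_star > 0, 1.0, -1.0)      # ground truth labels of the training set

# Hypothesis space: linear scoring functions f(x) = w . x, indexed by the weight vector w.
def empirical_risk(w):
    margins = y * (X @ w)
    return np.mean(np.log1p(np.exp(-margins)))   # average logistic loss over the training set

def risk_gradient(w):
    margins = y * (X @ w)
    coeff = -y / (1.0 + np.exp(margins))         # derivative of log(1 + exp(-y * s)) w.r.t. s
    return (X * coeff[:, None]).mean(axis=0)

# Empirical risk minimization: choose the hypothesis with low loss on the training set.
w = np.zeros(d)
for _ in range(500):
    w -= 0.5 * risk_gradient(w)

print(f"empirical risk after training: {empirical_risk(w):.4f}")

In a ranking application, each of these placeholders would be replaced by its ranking-specific counterpart: feature vectors describing query-document pairs, relevance labels, a scoring function, and a loss that reflects ranking quality.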
13 For a more comprehensive introduction to the machine learning literature, please refer to [54].
14 In this topic, when we mention the output space, we mainly refer to the second type.