function. In particular, it defines the loss function based on the cosine similarity
between the score vector output by the scoring function f for query q and the
score vector defined by the ground truth labels (referred to as the cosine loss for
short). That is,
$$L(f;\mathbf{x},\mathbf{y}) = \frac{1}{2}\left(1 - \frac{\sum_{j=1}^{m}\phi(y_j)\,\phi(f(x_j))}{\sqrt{\sum_{j=1}^{m}\phi^{2}(y_j)}\,\sqrt{\sum_{j=1}^{m}\phi^{2}(f(x_j))}}\right) \qquad (2.20)$$
where ϕ is a transformation function, which can be linear, exponential, or logistic.
After defining the cosine loss, the gradient descent method is used to perform the
optimization and learn the scoring function.
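As a concrete illustration, the sketch below computes the cosine loss of Eq. (2.20) and fits a linear scoring function by plain gradient descent. The feature vectors, labels, identity choice of ϕ, learning rate, and the finite-difference gradient are all hypothetical simplifications for brevity, not the setup of [17]:

```python
import math

def cosine_loss(labels, scores, phi=lambda t: t):
    """Cosine loss of Eq. (2.20); phi defaults to the identity transformation."""
    a = [phi(y) for y in labels]
    b = [phi(s) for s in scores]
    dot = sum(ai * bi for ai, bi in zip(a, b))
    norm = math.sqrt(sum(ai * ai for ai in a)) * math.sqrt(sum(bi * bi for bi in b))
    return 0.5 * (1.0 - dot / norm)

# Hypothetical query: three documents (two features each) with graded labels.
docs = [[1.0, 0.2], [0.5, 0.8], [0.1, 0.1]]
labels = [2.0, 1.0, 0.0]

def loss_of(w):
    # Linear scoring function f(x) = w . x applied to every document of the query.
    scores = [sum(wi * xi for wi, xi in zip(w, x)) for x in docs]
    return cosine_loss(labels, scores)

# Plain gradient descent with a finite-difference gradient (illustrative only;
# in practice the analytic gradient of Eq. (2.20) would be used).
w = [0.1, 0.1]
initial_loss = loss_of(w)
eps, lr = 1e-6, 0.5
for _ in range(200):
    grad = [(loss_of([wi + (eps if i == j else 0.0) for j, wi in enumerate(w)])
             - loss_of(w)) / eps for i in range(len(w))]
    w = [wi - lr * g for wi, g in zip(w, grad)]
final_loss = loss_of(w)
```

After a few hundred steps the loss drops well below its initial value, with the learned scores ordering the documents consistently with their labels.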
According to [17], the so-defined cosine loss has the following properties.
- The cosine loss can be regarded as a kind of regression loss, since it requires the predicted relevance of a document to be as close as possible to the ground truth label.
- Because of the query-level normalization factor (the denominator in the loss function), the cosine loss is insensitive to the varying numbers of documents associated with different queries.
- The cosine loss is bounded between 0 and 1; thus the overall loss on the training set will not be dominated by a few hard queries.
- The cosine loss is scale invariant. That is, if we multiply all the ranking scores output by the scoring function by the same positive constant, the cosine loss will not change. This is quite in accordance with our intuition on ranking, which depends only on the relative order of the scores.
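The boundedness and scale-invariance properties can be checked numerically. In the sketch below the labels, scores, and the identity choice of ϕ are hypothetical:

```python
import math

def cosine_loss(labels, scores):
    # Cosine loss of Eq. (2.20) with the identity transformation phi(t) = t.
    dot = sum(y * s for y, s in zip(labels, scores))
    norm = math.sqrt(sum(y * y for y in labels)) * math.sqrt(sum(s * s for s in scores))
    return 0.5 * (1.0 - dot / norm)

labels = [2.0, 1.0, 0.0]
scores = [1.5, 0.4, 0.9]        # hypothetical model outputs

base = cosine_loss(labels, scores)
scaled = cosine_loss(labels, [10.0 * s for s in scores])  # same positive constant
worst = cosine_loss(labels, [-y for y in labels])         # scores opposite to labels
```

Here `base` and `scaled` coincide (scale invariance), and `worst` attains the upper bound of 1, since the cosine similarity lies in [-1, 1] and the loss is 0.5(1 - cos).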
2.6 Summary
In this chapter, we have introduced various pointwise ranking methods, discussed
their relationships with previous learning-based information retrieval models, and
analyzed their limitations.
So far, the pointwise approach can only be a sub-optimal solution to ranking. To
tackle this problem, researchers have attempted to take document pairs, or the
entire set of documents associated with the same query, as the input object. This
results in the pairwise and listwise approaches to learning to rank. With the
pairwise approach, the relative order between documents can be better modeled.
With the listwise approach, positional information becomes visible to the
learning-to-rank process.
2.7 Exercises
2.1 Enumerate widely used loss functions for classification, and prove whether they
are convex.