The Pointwise Approach - Learning to Rank for Information Retrieval

Information Technology Reference

In-Depth Information

2.4 Ordinal Regression-Based Algorithms

Ordinal regression takes the ordinal relationship between the ground truth labels

into consideration when learning the ranking model.

Suppose there are K ordered categories. The goal of ordinal regression is to find

a scoring function f , such that one can easily use thresholds b 1 ≤

b 2 ≤···≤

b K − 1

to distinguish the outputs of the scoring function (i.e., f(x j ), j =

1 ,...,m )into

different ordered categories. 2

2, the ordinal regression

will naturally reduce to a binary classification. In this regard, the ordinal regression-

based algorithms have strong connections with classification-based algorithms.

In the literature, there are several methods in this sub-category, such as [ 1 , 3 , 6 ,

20 ], and [ 2 ]. We will introduce some of them as follows.

Note that when we set K

=

2.4.1 Perceptron-Based Ranking (PRanking)

PRanking is a famous algorithm on ordinal regression [ 6 ]. The goal of PRanking is

to find a direction defined by a parameter vector w , after projecting the documents

onto which one can easily use thresholds to distinguish the documents into different

ordered categories.

This goal is achieved by means of an iterative learning process. On iteration t ,

the learning algorithm gets an instance x j associated with query q .Given x j ,the

algorithm predicts

w T x j −

y j =

ˆ

arg min k {

b k < 0

}

. It then receives the ground truth

label y j . If the algorithm makes a mistake (i.e.,

y j ) then there is at least one

threshold, indexed by k , for which the value of w T x j is on the wrong side of b k .To

correct the mistake, we need to move the values of w T x j and b k toward each other.

An example is shown in Fig. 2.3 . After that, the model parameter w is adjusted by

w

y j =

ˆ

x j , just as in many perceptron-based algorithms. This process is repeated

until the training process converges.

Harrington [ 10 ] later proposes using ensemble learning to further improve the

performance of PRanking. In particular, the training data is first sub-sampled,

and a PRanking model is then learned using each sample. After that, the weights

and thresholds associated with the models are averaged to produce the final

model. It has been proven that in this way a better generalization ability can be

achieved [ 12 ].

=

w

+

2 Note that there are some algorithms, such as [ 11 , 13 ], which were also referred to as ordinal

regression-based algorithms in the literature. According to our categorization, however, they be-

long to the pairwise approach since they do not really care about the accurate assignment of a

document to one of the ordered categories. Instead, they focus more on the relative order between

two documents.

Learning to Rank for Information Retrieval

Search WWH ::

Custom Search

Home