For regularization purposes, the average of all Perceptron models obtained dur-
ing training is used as the final ranking model. The model has been tested on the
Yahoo! QA data, in terms of P@1 and MRR. The experimental results show that the
learning-to-rank method can significantly outperform several non-learning baseline
methods, and better ranking performances can be achieved when more features are
used in the learning-to-rank process.
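The averaging idea can be sketched in a few lines. The pairwise perceptron update below is an illustrative choice (the study's exact update rule and feature handling are not specified here); the key point is that the returned model is the mean of all weight vectors seen during training rather than the final one:

```python
def averaged_perceptron_rank(question_groups, n_features, epochs=5):
    # Hypothetical sketch of an averaged perceptron ranker.
    # question_groups: list of (candidates, labels) per question, where
    # candidates is a list of feature vectors and labels marks relevance.
    w = [0.0] * n_features        # current weight vector
    w_sum = [0.0] * n_features    # running sum for averaging
    n_seen = 0
    for _ in range(epochs):
        for candidates, labels in question_groups:
            for xi, yi in zip(candidates, labels):
                for xj, yj in zip(candidates, labels):
                    if yi > yj:  # xi should be ranked above xj
                        margin = sum(wk * (a - b)
                                     for wk, a, b in zip(w, xi, xj))
                        if margin <= 0:  # mis-ordered pair: update
                            w = [wk + (a - b)
                                 for wk, a, b in zip(w, xi, xj)]
                        # accumulate after every pair, update or not
                        w_sum = [s + wk for s, wk in zip(w_sum, w)]
                        n_seen += 1
    # the averaged model is used for ranking at test time
    return [s / max(n_seen, 1) for s in w_sum]
```

Averaging damps the oscillation of the raw perceptron weights and acts as a simple regularizer, which is why it is preferred over the last weight vector.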
14.2.4 Why QA
Why-questions are widely asked in the real world. Answers to why-questions tend
to be at least one sentence and at most one paragraph in length. Passage retrieval
therefore appears to be a suitable approach to Why QA.
In [ 15 ], different learning-to-rank algorithms are empirically investigated to
perform the task of answering why-questions. For this purpose, the Wikipedia
INEX corpus is used, which consists of 659,388 articles extracted from the online
Wikipedia in the summer of 2006, converted to XML format. By applying some
segmentation methods, 6,365,890 passages are generated, which are the objects to
be ranked with respect to given why-questions.
For each passage, 37 features are extracted: one TF-IDF feature; 14 syntactic
features describing the overlap between QA constituents (e.g., subject, verb,
question focus); 14 WordNet expansion features describing the overlap between the
WordNet synsets of QA constituents; one cue word feature describing the overlap
between the candidate answer and a predefined set of explanatory cue words; six
document structure features describing the overlap between question words and the
document title and section headings; and one WordNet relatedness feature describing
the relatedness between question and answer according to the WordNet similarity tool.
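Most of these features share one shape: a normalized overlap between a bag of question-side terms and a bag of answer-side terms. A minimal sketch, assuming tokenized input (the cue word set below is purely illustrative, not the paper's actual list):

```python
def overlap(question_tokens, answer_tokens):
    # Fraction of distinct question-side terms that also occur on the
    # answer side -- the general shape of the syntactic and WordNet
    # expansion overlap features.
    q, a = set(question_tokens), set(answer_tokens)
    return len(q & a) / len(q) if q else 0.0

# Illustrative explanatory cue words; the predefined set used in the
# study is not reproduced here.
CUE_WORDS = {"because", "since", "therefore", "due", "reason"}

def cue_word_feature(answer_tokens):
    # Binary feature: does the candidate answer contain any cue word?
    return int(bool(CUE_WORDS & set(answer_tokens)))
```

The same `overlap` function can be applied to different term sets (subject words, verb words, synset members, title words) to produce the separate feature groups listed above.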
Based on the aforementioned data and feature representations, a number of
learning-to-rank methods are examined, including the pointwise approach (Naive
Bayes, Support Vector Classification, Support Vector Regression, Logistic Regres-
sion), pairwise approach (Pairwise Naive Bayes, Pairwise Support Vector Classifi-
cation, Pairwise Support Vector Regression, Pairwise Logistic Regression, Ranking
SVM), and listwise approach (RankGP). MRR and Success at Position 10 are used
as evaluation measures.
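Both measures have standard definitions and can be computed directly from per-question ranked relevance labels; a minimal sketch:

```python
def mrr(rank_lists):
    # Mean Reciprocal Rank: average over questions of 1/rank of the
    # first relevant answer; a question with no relevant answer
    # retrieved contributes 0.
    # rank_lists: per-question lists of 0/1 labels in ranked order.
    total = 0.0
    for labels in rank_lists:
        for pos, rel in enumerate(labels, start=1):
            if rel:
                total += 1.0 / pos
                break
    return total / len(rank_lists)

def success_at(rank_lists, k=10):
    # Success@k: fraction of questions with at least one relevant
    # answer among the top k retrieved passages.
    hits = sum(1 for labels in rank_lists if any(labels[:k]))
    return hits / len(rank_lists)
```

For example, if the first relevant passage appears at ranks 2, 1, and not at all for three questions, MRR is (1/2 + 1 + 0)/3 = 0.5.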
Three factors are considered in the empirical investigation: (1) the distinction
between the pointwise approach, the pairwise approach, and the listwise approach;
(2) the distinction between techniques based on classification and techniques based
on regression; and (3) the distinction between techniques with and without hyper-
parameters that must be tuned.
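The pointwise/pairwise distinction in factor (1) is largely a matter of how the training data are presented to the learner. A common construction behind "Pairwise" variants of pointwise methods, sketched here under the assumption of binary relevance labels, turns each question's candidates into feature-difference vectors with a binary preference label:

```python
def to_pairwise(candidates, labels):
    # Turn one question's pointwise data (feature vectors + relevance
    # labels) into pairwise training instances: for every
    # relevant/non-relevant pair, emit the feature difference labeled 1
    # and its negation labeled 0.
    pairs = []
    for i in range(len(labels)):
        for j in range(len(labels)):
            if labels[i] > labels[j]:
                diff = [a - b for a, b in zip(candidates[i], candidates[j])]
                pairs.append((diff, 1))            # i preferred over j
                pairs.append(([-d for d in diff], 0))  # mirror instance
    return pairs
```

Any binary classifier or regressor trained on such instances becomes a pairwise ranker, which is how the same base learners (Naive Bayes, SVM, logistic regression) appear in both the pointwise and pairwise rows of the comparison.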
With respect to (1), the experimental results indicate that one is able to obtain
good results with both the pointwise and the pairwise approaches. The optimum
score is reached by Support Vector Regression for the pairwise representation, but
some of the pointwise settings reach scores that are not significantly lower than
this optimum. The explanation is that the relevance labeling of the data is on a