For regularization purposes, the average of all Perceptron models obtained dur-
ing training is used as the final ranking model. The model has been tested on the
Yahoo! QA data, in terms of P@1 and MRR. The experimental results show that the
learning-to-rank method can significantly outperform several non-learning baseline
methods, and better ranking performances can be achieved when more features are
used in the learning-to-rank process.
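The averaging idea can be sketched in a few lines. The pairwise perceptron update below is an illustrative choice (the study's exact update rule and feature handling are not specified here); the key point is that the returned model is the mean of all weight vectors seen during training rather than the final one:

```python
def averaged_perceptron_rank(question_groups, n_features, epochs=5):
    # Hypothetical sketch of an averaged perceptron ranker.
    # question_groups: list of (candidates, labels) per question, where
    # candidates is a list of feature vectors and labels marks relevance.
    w = [0.0] * n_features        # current weight vector
    w_sum = [0.0] * n_features    # running sum for averaging
    n_seen = 0
    for _ in range(epochs):
        for candidates, labels in question_groups:
            for xi, yi in zip(candidates, labels):
                for xj, yj in zip(candidates, labels):
                    if yi > yj:  # xi should be ranked above xj
                        margin = sum(wk * (a - b)
                                     for wk, a, b in zip(w, xi, xj))
                        if margin <= 0:  # mis-ordered pair: update
                            w = [wk + (a - b)
                                 for wk, a, b in zip(w, xi, xj)]
                        # accumulate after every pair, update or not
                        w_sum = [s + wk for s, wk in zip(w_sum, w)]
                        n_seen += 1
    # the averaged model is used for ranking at test time
    return [s / max(n_seen, 1) for s in w_sum]
```

Averaging damps the oscillation of the raw perceptron weights and acts as a simple regularizer, which is why it is preferred over the last weight vector.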
14.2.4 Why QA
Why-questions are widely asked in the real world. Answers to why-questions tend
to be at least one sentence and at most one paragraph in length. Passage retrieval
therefore appears to be a suitable approach to Why QA.
In [ 15 ], different learning-to-rank algorithms are empirically investigated to
perform the task of answering why-questions. For this purpose, the Wikipedia
INEX corpus is used, which consists of 659,388 articles extracted from the online
Wikipedia in the summer of 2006, converted to XML format. By applying some
segmentation methods, 6,365,890 passages are generated, which are the objects to
be ranked with respect to given why-questions.
For each passage, 37 features are extracted: one TF-IDF feature; 14 syntactic
features describing the overlap between QA constituents (e.g., subject, verb,
question focus); 14 WordNet expansion features describing the overlap between the
WordNet synsets of QA constituents; one cue word feature describing the overlap
between the candidate answer and a predefined set of explanatory cue words; six
document structure features describing the overlap between question words and the
document title and section headings; and one WordNet relatedness feature describing
the relatedness between question and answer according to the WordNet similarity tool.
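Most of these features share one shape: a normalized overlap between a bag of question-side terms and a bag of answer-side terms. A minimal sketch, assuming tokenized input (the cue word set below is purely illustrative, not the paper's actual list):

```python
def overlap(question_tokens, answer_tokens):
    # Fraction of distinct question-side terms that also occur on the
    # answer side -- the general shape of the syntactic and WordNet
    # expansion overlap features.
    q, a = set(question_tokens), set(answer_tokens)
    return len(q & a) / len(q) if q else 0.0

# Illustrative explanatory cue words; the predefined set used in the
# study is not reproduced here.
CUE_WORDS = {"because", "since", "therefore", "due", "reason"}

def cue_word_feature(answer_tokens):
    # Binary feature: does the candidate answer contain any cue word?
    return int(bool(CUE_WORDS & set(answer_tokens)))
```

The same `overlap` function can be applied to different term sets (subject words, verb words, synset members, title words) to produce the separate feature groups listed above.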
Based on the aforementioned data and feature representations, a number of
learning-to-rank methods are examined, including the pointwise approach (Naive
Bayes, Support Vector Classification, Support Vector Regression, Logistic Regres-
sion), pairwise approach (Pairwise Naive Bayes, Pairwise Support Vector Classifi-
cation, Pairwise Support Vector Regression, Pairwise Logistic Regression, Ranking
SVM), and listwise approach (RankGP). MRR and Success at Position 10 are used
as evaluation measures.
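Both measures have standard definitions and can be computed directly from per-question ranked relevance labels; a minimal sketch:

```python
def mrr(rank_lists):
    # Mean Reciprocal Rank: average over questions of 1/rank of the
    # first relevant answer; a question with no relevant answer
    # retrieved contributes 0.
    # rank_lists: per-question lists of 0/1 labels in ranked order.
    total = 0.0
    for labels in rank_lists:
        for pos, rel in enumerate(labels, start=1):
            if rel:
                total += 1.0 / pos
                break
    return total / len(rank_lists)

def success_at(rank_lists, k=10):
    # Success@k: fraction of questions with at least one relevant
    # answer among the top k retrieved passages.
    hits = sum(1 for labels in rank_lists if any(labels[:k]))
    return hits / len(rank_lists)
```

For example, if the first relevant passage appears at ranks 2, 1, and not at all for three questions, MRR is (1/2 + 1 + 0)/3 = 0.5.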
Three factors are considered in the empirical investigation: (1) the distinction
between the pointwise approach, the pairwise approach, and the listwise approach;
(2) the distinction between techniques based on classification and techniques based
on regression; and (3) the distinction between techniques with and without hyper-
parameters that must be tuned.
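The pointwise/pairwise distinction in factor (1) is largely a matter of how the training data are presented to the learner. A common construction behind "Pairwise" variants of pointwise methods, sketched here under the assumption of binary relevance labels, turns each question's candidates into feature-difference vectors with a binary preference label:

```python
def to_pairwise(candidates, labels):
    # Turn one question's pointwise data (feature vectors + relevance
    # labels) into pairwise training instances: for every
    # relevant/non-relevant pair, emit the feature difference labeled 1
    # and its negation labeled 0.
    pairs = []
    for i in range(len(labels)):
        for j in range(len(labels)):
            if labels[i] > labels[j]:
                diff = [a - b for a, b in zip(candidates[i], candidates[j])]
                pairs.append((diff, 1))            # i preferred over j
                pairs.append(([-d for d in diff], 0))  # mirror instance
    return pairs
```

Any binary classifier or regressor trained on such instances becomes a pairwise ranker, which is how the same base learners (Naive Bayes, SVM, logistic regression) appear in both the pointwise and pairwise rows of the comparison.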
With respect to (1), the experimental results indicate that one is able to obtain
good results with both the pointwise and the pairwise approaches. The optimum
score is reached by Support Vector Regression for the pairwise representation, but
some of the pointwise settings reach scores that are not significantly lower than
this optimum. The explanation is that the relevance labeling of the data is on a