NLP (such as part-of-speech tagging and sentence parsing) is increasingly
being achieved through machine learning. Li and Roth (27), Hacioglu and
Ward (16) and Zhang and Lee (40) have used supervised learning for question
classification.
The use of machine learning has enabled the above systems to handle larger
datasets and more complex type systems. A benchmark available from UIUC 1
is now standard. It has 6 coarse and 50 fine answer types in a two-level
taxonomy, together with 5500 training and 500 test questions. Webclopedia
(18) has also published its taxonomy with over 140 types.
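As a concrete illustration, a loader for benchmark data of this kind might look as follows. The line format assumed here (a "COARSE:fine" label, a space, then the question text) and the specific labels are illustrative assumptions, not taken from the text:

```python
# Sketch of parsing one line of a question-classification dataset.
# Assumed line format: "COARSE:fine question text ..." (an assumption
# about the file layout, not confirmed by the surrounding text).
def parse_line(line):
    label, question = line.split(" ", 1)
    coarse, fine = label.split(":", 1)
    return coarse, fine, question.strip()

# Hypothetical labeled example using the chapter's running question.
coarse, fine, q = parse_line("LOC:mount What is the tallest mountain in Africa ?")
print(coarse, fine)  # LOC mount
```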
Compared to other areas of text mining, question classification has benefited
from machine learning somewhat less than one might expect.
Li and Roth (27) used question features like tokens, parts of speech (POS),
chunks (non-overlapping phrases) and named entity (NE) tags. Some of
these features, such as part-of-speech, may themselves be generated from
sophisticated inference methods. Li and Roth achieved 78.8% accuracy for
50 classes. When a hand-built dictionary of “semantically related words”
(unpublished, to our knowledge) was added, accuracy improved to 84.2%. It seems
desirable to use only off-the-shelf knowledge bases and labeled training data
consisting of questions and their atypes. Designing and maintaining the
dictionary may be comparable in effort to maintaining a rule base.
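The kind of sparse indicator-feature map described above can be sketched as follows. The tag values shown are assumptions for illustration; in practice the POS, chunk, and named-entity tags would come from upstream taggers:

```python
# Sketch of a sparse binary feature map over tokens, POS tags, chunk
# labels, and named-entity tags, in the spirit of Li and Roth's features.
# All tags are assumed to be precomputed by upstream components.
def question_features(tokens, pos_tags, chunks, ne_tags):
    feats = set()
    feats.update(f"tok={t.lower()}" for t in tokens)
    feats.update(f"pos={p}" for p in pos_tags)
    feats.update(f"chunk={c}" for c in chunks)
    feats.update(f"ne={n}" for n in ne_tags if n != "O")  # skip non-entities
    return sorted(feats)

# Hypothetical tags for the running example question.
feats = question_features(
    ["What", "is", "the", "tallest", "mountain", "in", "Africa", "?"],
    ["WP", "VBZ", "DT", "JJS", "NN", "IN", "NNP", "."],
    ["NP", "VP", "NP", "NP", "NP", "PP", "NP", "O"],
    ["O", "O", "O", "O", "O", "O", "LOCATION", "O"],
)
```

Each feature is an indicator (present or absent), so the resulting vectors are very high-dimensional but extremely sparse, which linear classifiers handle well.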
Support Vector Machines (SVMs) (38) have been widely successful in many
other learning tasks. SVMs were applied to question classification shortly
after the work of Li and Roth. Hacioglu and Ward (16) used linear support
vector machines with a very simple set of features: question word 2-grams.
E.g., the question “What is the tallest mountain in Africa?” leads to the
features “what is”, “is the”, “the tallest”, etc., which can be collected in
a bag of 2-grams. (It may help to mark the beginning 2-gram in some special way.) They
did not use any named-entity tags or related word dictionary. Early SVM
formulations and implementations usually handled two classes. Hacioglu and
Ward used a technique by Dietterich and Bakiri (12) to adapt two-class SVMs
to the multiclass setting in question classification. The high-level idea is to
represent class labels with carefully chosen numbers, represent the numbers in
the binary system and have one SVM predict each bit position. This is called
the “error-correcting output code” (ECOC) approach. The overall accuracy
was 80.2-82%, slightly higher than Li and Roth's baseline.
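Both ingredients can be sketched in a few lines. The begin marker `<s>`, the toy four-bit codebook, and the stubbed-out per-bit classifiers are assumptions for illustration; the actual codewords in ECOC are chosen to maximize Hamming distance between classes:

```python
# Bag of word 2-grams with a special begin marker, as described above.
def bigram_features(question):
    tokens = ["<s>"] + question.lower().rstrip("?").split()
    return set(zip(tokens, tokens[1:]))

# ECOC: each class gets a binary codeword; one binary classifier is trained
# per bit position (stubbed here). At prediction time the bit string the
# classifiers emit is decoded to the nearest codeword by Hamming distance,
# which is what makes the code error-correcting.
def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def ecoc_decode(predicted_bits, codebook):
    return min(codebook, key=lambda c: hamming(codebook[c], predicted_bits))

codebook = {            # toy codewords for 3 classes (illustrative only)
    "LOC": (0, 0, 1, 1),
    "NUM": (0, 1, 0, 1),
    "HUM": (1, 1, 1, 0),
}
bag = bigram_features("What is the tallest mountain in Africa?")
# Suppose the four bit-classifiers output (1, 1, 1, 1): this is one bit
# away from HUM's codeword and two bits from the others, so the single
# bit error is corrected.
print(ecoc_decode((1, 1, 1, 1), codebook))  # HUM
```

With codewords spaced at Hamming distance d, up to floor((d-1)/2) individual classifier errors can be corrected, which is why ECOC can outperform naive one-bit-per-class encodings.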
Zhang and Lee (40) used linear SVMs with all possible question word q-grams,
i.e., the above question now leads to the features “what”, “what is”,
“what is the”, ..., “is”, “is the”, “is the tallest”, etc. They obtained an accuracy of
79.2% without using ECOC, slightly higher than the Li and Roth baseline
but somewhat lower than Hacioglu and Ward. Zhang and Lee went on to
design an ingenious kernel on question parse trees, which yielded visible gains
for the 6 coarse labels in the UIUC classification system. The accuracy gain
1 http://l2r.cs.uiuc.edu/~cogcomp/Data/QA/QC/