to a common ancestor node called person (strictly speaking, person#n#1, the
first noun sense of the string person).
In the above example, if we walked up the WordNet hierarchy and included
all hypernyms (generalizations) of informer tokens in our bag of features,
we would get a much stronger correlation between the informer hypernym
feature person#n#1 and the question class label HUMAN:individual. In our
implementation we look up an informer token, walk up to more general
types, and include all of them in the bag of features. For example, if
the informer token is CEO, we would include in the feature bag all of these
features: corporate_executive#n#1, executive#n#1, administrator#n#1,
head#n#4, leader#n#1, person#n#1, organism#n#1, living_thing#n#1,
object#n#1, physical_entity#n#1, causal_agent#n#1, entity#n#1. Some
features, such as those more general than person#n#1 above, are too general;
they will be found to have poor correlation with the class label
HUMAN:individual, enabling the SVM to ignore them. For informer spans having
more than one token, we look up WordNet not only for individual informer
tokens but also for informer q-grams, because some tokens may be part of
compounds, as in “Which breed of hunting dog ...,” “Which European prime
minister ...,” “What is the conversion rate ...” and “Which mountain
range ....”
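A minimal sketch of this hypernym expansion, using NLTK's WordNet interface, is shown below. The function name and the use of NLTK are our illustrative assumptions, not the implementation described in this chapter; note also that NLTK writes sense identifiers as person.n.01 rather than person#n#1.

```python
from nltk.corpus import wordnet as wn  # assumes nltk.download('wordnet') has been run

def hypernym_features(token):
    """Collect WordNet hypernym features for an informer token.

    Illustrative sketch: for each noun sense of the token, walk the
    hypernym links all the way up to the root (entity) and add every
    synset encountered to the bag of features.
    """
    feats = set()
    for syn in wn.synsets(token, pos=wn.NOUN):
        feats.add(syn.name())  # e.g. 'person.n.01'
        # closure() transitively follows hypernym links toward the root
        for hyper in syn.closure(lambda s: s.hypernyms()):
            feats.add(hyper.name())
    return feats

# hypernym_features('CEO') yields features such as 'executive.n.01',
# 'leader.n.01', 'person.n.01', ..., up to 'entity.n.01'.
```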
10.2.3.3 Supplementary word features
If informer extraction were perfect, extracting other features from the rest of
the question would appear unnecessary. As we have discussed before, however,
because the informer span annotator is a learning program, it will make
mistakes. Moreover, we use no word sense disambiguation (WSD) while processing
informer tokens: “How long ...” may refer to either time or distance, and
“Which bank ...” may be about rivers or financial institutions. When we
connect informer tokens to WordNet and expand to ancestors, we may amplify
these ambiguities.
For the above reasons, it is a good idea to include additional features from
regular question words. The word feature extractor selects unigrams and
q-grams from the question. In our experiments, q = 1 or q = 2 worked best;
but, if unspecified, all possible q-grams were used. As with informers, we can
also use hypernyms of regular words as SVM features.
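As a concrete illustration, the sketch below extracts word q-grams from a tokenized question. The function and its defaults are our own illustrative choices, not the chapter's exact extractor; the default q_values=(1, 2) mirrors the settings reported to work best above.

```python
def word_qgrams(tokens, q_values=(1, 2)):
    """Extract word q-grams (contiguous token sequences) as features.

    Illustrative sketch: each q-gram is joined into a single feature
    string so it can be fed to a bag-of-features SVM.
    """
    feats = []
    for q in q_values:
        for i in range(len(tokens) - q + 1):
            feats.append("_".join(tokens[i:i + q]))
    return feats

# word_qgrams("which mountain range".split())
# -> ['which', 'mountain', 'range', 'which_mountain', 'mountain_range']
```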
10.2.4 Experiments
To keep our performance numbers directly comparable to earlier work, we
used the dataset from UIUC³ (27), which is now somewhat standard in question
classification work. It has 6 coarse and 50 fine answer types in a two-level
taxonomy, together with 5500 training and 500 test questions. We had two
volunteers independently tag the 6000 UIUC questions with informer spans.
³ http://l2r.cs.uiuc.edu/~cogcomp/Data/QA/QC/