to a common ancestor node called person (strictly speaking, person#n#1, the
first noun sense of the string person).
In the above example, if we walked up the WordNet hierarchy and included
all hypernyms (generalizations) of informer tokens in our bag of features,
we would get a much stronger correlation between the informer hypernym
feature person#n#1 and the question class label HUMAN:individual. In our
implementation we look up an informer token, walk up to more general
types, and include all of them in the bag of features. For example, if
the informer token is CEO, we would include in the feature bag all of these
features: corporate_executive#n#1, executive#n#1, administrator#n#1,
head#n#4, leader#n#1, person#n#1, organism#n#1, living_thing#n#1,
object#n#1, physical_entity#n#1, causal_agent#n#1, entity#n#1. Some
features, such as those more general than person#n#1 above, are too general;
they will be found to have poor correlation with the class label
HUMAN:individual, enabling the SVM to ignore them. For informer spans having
more than one token, we look up WordNet not only for individual informer
tokens but also for informer q-grams, because some tokens may be part of
compounds, as in “Which breed of hunting dog ...,” “Which European prime
minister ...,” “What is the conversion rate ...” and “Which mountain
range ....”
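A minimal sketch of this hypernym expansion, using NLTK's WordNet interface, is shown below. The function name and the use of NLTK are our illustrative assumptions, not the implementation described in this chapter; note also that NLTK writes sense identifiers as person.n.01 rather than person#n#1.

```python
from nltk.corpus import wordnet as wn  # assumes nltk.download('wordnet') has been run

def hypernym_features(token):
    """Collect WordNet hypernym features for an informer token.

    Illustrative sketch: for each noun sense of the token, walk the
    hypernym links all the way up to the root (entity) and add every
    synset encountered to the bag of features.
    """
    feats = set()
    for syn in wn.synsets(token, pos=wn.NOUN):
        feats.add(syn.name())  # e.g. 'person.n.01'
        # closure() transitively follows hypernym links toward the root
        for hyper in syn.closure(lambda s: s.hypernyms()):
            feats.add(hyper.name())
    return feats

# hypernym_features('CEO') yields features such as 'executive.n.01',
# 'leader.n.01', 'person.n.01', ..., up to 'entity.n.01'.
```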
10.2.3.3 Supplementary word features
If informer extraction were perfect, extracting other features from the rest of
the question would appear unnecessary. As we have discussed before, however,
because the informer span annotator is a learning program, it will make
mistakes. Moreover, we use no word sense disambiguation (WSD) while processing
informer tokens: “How long ...” may refer to either time or distance, and
“Which bank ...” may be about rivers or financial institutions. When we
connect informer tokens to WordNet and expand to ancestors, we may amplify
these ambiguities.
For the above reasons, it is a good idea to include additional features from
regular question words. The word feature extractor selects unigrams and
q-grams from the question. In our experiments, q = 1 or q = 2 worked best;
but, if unspecified, all possible q-grams were used. As with informers, we can
also use hypernyms of regular words as SVM features.
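As a concrete illustration, the sketch below extracts word q-grams from a tokenized question. The function and its defaults are our own illustrative choices, not the chapter's exact extractor; the default q_values=(1, 2) mirrors the settings reported to work best above.

```python
def word_qgrams(tokens, q_values=(1, 2)):
    """Extract word q-grams (contiguous token sequences) as features.

    Illustrative sketch: each q-gram is joined into a single feature
    string so it can be fed to a bag-of-features SVM.
    """
    feats = []
    for q in q_values:
        for i in range(len(tokens) - q + 1):
            feats.append("_".join(tokens[i:i + q]))
    return feats

# word_qgrams("which mountain range".split())
# -> ['which', 'mountain', 'range', 'which_mountain', 'mountain_range']
```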
10.2.4 Experiments
To keep our performance numbers directly comparable to earlier work, we
used the dataset from UIUC³ (27), which is now somewhat standard in question
classification work. It has 6 coarse and 50 fine answer types in a two-level
taxonomy, together with 5500 training and 500 test questions. We had two
volunteers independently tag the 6000 UIUC questions with informer spans.
³ http://l2r.cs.uiuc.edu/~cogcomp/Data/QA/QC/