WHNP: For questions with what and which, use the WHNP if it encloses
a noun. The WHNP is the noun phrase corresponding to the wh-word,
as given by the Stanford parser.
NP1: Otherwise, for what and which questions, the first (leftmost) noun
phrase contributes yet another feature.
We name apart the features in the cases above, so that there is no ambiguity
about which rule fired to create a given feature.
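As a small illustration of naming features apart (the function and its inputs are hypothetical; the parse extraction itself is not shown), each feature can be prefixed with the rule that produced it:

```python
def informer_features(wh_word, whnp_tokens, first_np_tokens):
    """Generate named-apart features for what/which questions.

    whnp_tokens: tokens inside the WHNP from the parser, or None if the
    WHNP encloses no noun; first_np_tokens: tokens of the leftmost NP.
    (Hypothetical helper; extracting these spans from the parse tree
    is assumed to happen elsewhere.)
    """
    features = []
    if wh_word in ("what", "which"):
        if whnp_tokens:  # WHNP rule fired
            features += ["WHNP=" + t for t in whnp_tokens]
        else:            # fall back to the leftmost noun phrase
            features += ["NP1=" + t for t in first_np_tokens]
    return features

# A feature like "WHNP=city" can never collide with "NP1=city",
# so the learner can tell which rule created each feature.
print(informer_features("what", ["city"], ["the", "capital"]))
```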
10.2.3 From Type Clue Spans to Answer Types
We will generate features from the whole question as well as the segment
designated as the informer span, but these features will be “named apart”
so that the learner downstream can distinguish between these features.
Figure 10.7 shows the arrangement, an instance of stacked or meta-learning
(8). The first-level learner is a CRF, and the second-level learner
is a linear SVM.
FIGURE 10.7: The meta-learning approach. [Figure: the question feeds both a word and q-gram feature extractor and the CRF informer span tagger; an informer feature extractor produces features from the tagged span; the two feature streams join into a combined feature vector from which the class is predicted.]
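A minimal sketch of the combined feature vector of Figure 10.7 (the feature-name prefixes and the span representation are assumptions, not the book's code): question-level word features and informer-span features are named apart and merged before being handed to the second-level SVM.

```python
def combined_features(question_tokens, informer_span):
    """Build a combined feature vector from whole-question features
    and informer-span features, named apart by prefix.

    informer_span is a (start, end) token-index pair produced by the
    first-level CRF tagger (half-open interval; an assumption here).
    """
    start, end = informer_span
    feats = {}
    for tok in question_tokens:              # features from the whole question
        feats["Q=" + tok.lower()] = 1
    for tok in question_tokens[start:end]:   # features from the informer span,
        feats["INF=" + tok.lower()] = 1      # named apart from the Q= features
    return feats

print(combined_features(["What", "city", "hosted", "the", "Olympics", "?"], (1, 2)))
```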
During training, there are two broad options:
1. For each training question, obtain both the true informer span and
the question class as supervised data. Train the question classifier by
generating features from the known informer span. Independently, train
a CRF as in Section 10.2.2 to identify the informer span. Collecting
training data for this option is tedious because the trainer has to identify
not only the atype but also the informer span for every question.
2. For a relatively small number of questions, provide hand-annotated
informer spans to train the CRF. For a much larger number of questions,
provide only the question class but not the informer span. The trained
CRF is then used to choose an informer span, which may be incorrect.
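The second option can be sketched as a two-stage training loop (every callable below is a hypothetical stand-in for the actual CRF and SVM trainers, which are not specified here):

```python
def train_meta(span_annotated, class_annotated,
               train_crf, predict_span, train_svm, featurize):
    """Option 2: a small span-annotated set trains the CRF; the
    (possibly noisy) CRF predictions then supply informer spans for the
    much larger class-annotated set used to train the SVM classifier.

    All four callables are hypothetical placeholders.
    """
    # Stage 1: train the informer-span tagger on the small annotated set.
    crf = train_crf([(q, span) for q, span in span_annotated])

    # Stage 2: use CRF-predicted spans (which may be incorrect) to
    # featurize the larger class-labeled set for the SVM.
    X, y = [], []
    for question, qclass in class_annotated:
        span = predict_span(crf, question)
        X.append(featurize(question, span))
        y.append(qclass)
    svm = train_svm(X, y)
    return crf, svm
```

Because the SVM is trained on CRF output rather than gold spans, it sees at training time the same kind of noise it will see when deployed.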
Not only is the second approach less work for the trainer, but it can also give
more robust accuracy when deployed. If the CRF makes systematic mistakes