Text Search-Enhanced with Types and Entities - Text Mining: Classification, Clustering, and Applications - page 252

only the leaf level of the parse tree. Clearly, adding non-local features from

higher levels in the tree helps, at least up to level two (but the degradation

thereafter from excess features is small). In fact, Figure 10.9 gives us the

hope that a full parse of the question may not be needed; a parser that can

recover chunk information up to level two, even from grammatically ill-formed

questions, will do fine.

[Plot omitted. Axes: accuracy from 0.78 to 0.85 (y) against #Levels from 0 to 5 (x); labels: Fraction, Jaccard.]

FIGURE 10.9: A significant boost in question classification accuracy is seen

when two levels of non-local features are provided to the SVM, compared to

just the POS features at the leaf of the parse tree.

Effect of number of CRF states: The last two columns of Figure 10.10

show that the 3-state CRF performs much better than the 2-state CRF. The

gain comes mainly from difficult questions that start with what and which.

In such questions, what and which are not useful in themselves, and the real

clues are surrounded by other important word clues, e.g., “What is the name

of Saturn's largest moon?” vs. “What mammal lays eggs?” etc. Deciphering

these patterns benefits most from the three-state CRF.
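
To make the difference between the two label sets concrete, here is a minimal sketch of how a question's tokens might be labeled under each scheme. It assumes the 2-state model marks each token as informer or other, and the 3-state model additionally splits the non-informer tokens into those before and those after the informer span; the state names, the function label_tokens, and the span convention are illustrative assumptions, not definitions taken from this page.

```python
def label_tokens(tokens, start, end, n_states):
    """Label each token given an informer span [start, end).

    n_states=2: 'informer' / 'other' (assumed 2-state scheme).
    n_states=3: 'before' / 'informer' / 'after' (assumed 3-state scheme).
    """
    if n_states not in (2, 3):
        raise ValueError("n_states must be 2 or 3")
    labels = []
    for i in range(len(tokens)):
        if start <= i < end:
            labels.append("informer")
        elif n_states == 2:
            labels.append("other")
        elif i < start:
            labels.append("before")
        else:
            labels.append("after")
    return labels


tokens = ["Who", "is", "the", "CEO", "of", "IBM"]
# The informer is taken to be "CEO" (token index 3).
print(label_tokens(tokens, 3, 4, 2))
print(label_tokens(tokens, 3, 4, 3))
```

Under the 3-state labeling, a sequence model can learn that tokens after the informer behave differently from tokens before it, which is one plausible reason the extra state helps on questions where the clue is surrounded by other words.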

Comparison with heuristic rules: Figure 10.10 also compares the

Jaccard accuracy of informers found by the CRF vs. informers found by

the heuristics described in Section 10.2.2.3. Again we see a clear superiority

of the CRF approach.
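
For reference, a Jaccard score between a predicted informer and the true informer can be sketched as the size of the intersection of the two token sets divided by the size of their union. The helper below assumes informers are represented as sets of token positions and treats two empty sets as a perfect match; both choices are assumptions for this example rather than details given in the text.

```python
def jaccard(predicted, gold):
    """Jaccard overlap |P & G| / |P | G| between two collections of token positions."""
    p, g = set(predicted), set(gold)
    if not p and not g:
        # Assumed convention: two empty informers count as a perfect match.
        return 1.0
    return len(p & g) / len(p | g)


# "Who is the CEO of IBM": gold informer is "CEO" (position 3).
gold = {3}
print(jaccard({0}, gold))     # prediction "Who": 0.0
print(jaccard({3}, gold))     # prediction "CEO": 1.0
print(jaccard({3, 4}, gold))  # prediction "CEO of": 0.5
```

Averaging such scores over a test set would give one accuracy figure per informer-finding method, which is the kind of number a comparison between the CRF and the heuristics relies on.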

Unlike the heuristic approach, the CRF approach is relatively robust to

the parser emitting a somewhat incorrect parse tree, which is not uncommon.

The heuristic approach picks the “easy” informer, who, over the better one,

CEO, in “Who is the CEO of IBM.” Its bias toward the NP-head can also be

a problem, as in “What country's president ....”
