only the leaf level of the parse tree. Clearly, adding non-local features from
higher levels in the tree helps, at least up to level two (but the degradation
thereafter from excess features is small). In fact, Figure 10.9 gives us the
hope that a full parse of the question may not be needed; a parser that can
recover chunk information up to level two, even from grammatically ill-formed
questions, will do fine.
[Plot: y-axis labeled "Fraction" and "Jaccard", ticks from 0.78 to 0.85; x-axis "#Levels", from 0 to 5.]
FIGURE 10.9 : A significant boost in question classification accuracy is seen
when two levels of non-local features are provided to the SVM, compared to
just the POS features at the leaf of the parse tree.
Effect of number of CRF states: The last two columns of Figure 10.10
show that the 3-state CRF performs much better than the 2-state CRF. The
gain comes mainly from difficult questions that start with what and which.
In such questions, what and which are not useful in themselves, and the real
clues are surrounded by other important word clues, e.g., “What is the name
of Saturn's largest moon?” vs. “What mammal lays eggs?” etc. Deciphering
these patterns benefits most from the three-state CRF.
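To make the difference between the two label spaces concrete, the following minimal sketch encodes one question under both schemes. It assumes that the 2-state CRF labels each token as informer or non-informer, while the 3-state CRF additionally distinguishes tokens before the informer span from tokens after it. The function names, the whitespace tokenization, and the choice of "moon" as the informer span are illustrative assumptions, not taken from the text.

```python
from typing import List, Tuple


def label_two_state(n_tokens: int, span: Tuple[int, int]) -> List[int]:
    """Assumed 2-state scheme: 1 inside the informer span, 0 elsewhere."""
    start, end = span  # end is exclusive
    return [1 if start <= i < end else 0 for i in range(n_tokens)]


def label_three_state(n_tokens: int, span: Tuple[int, int]) -> List[int]:
    """Assumed 3-state scheme: 0 before the span, 1 inside it, 2 after it."""
    start, end = span  # end is exclusive
    labels = []
    for i in range(n_tokens):
        if i < start:
            labels.append(0)
        elif i < end:
            labels.append(1)
        else:
            labels.append(2)
    return labels


# Hypothetical example: the informer span is assumed to be the token "moon".
tokens = "What is the name of Saturn's largest moon ?".split()
span = (7, 8)

two = label_two_state(len(tokens), span)
three = label_three_state(len(tokens), span)
for tok, a, b in zip(tokens, two, three):
    print(f"{tok:10s} 2-state={a} 3-state={b}")
```

Under this assumed encoding, the 3-state labels record whether a token lies before or after the informer, which is one plausible reason the model copes better with clues embedded among other content words.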
Comparison with heuristic rules: Figure 10.10 also compares the
Jaccard accuracy of informers found by the CRF vs. informers found by
the heuristics described in Section 10.2.2.3. Again we see a clear superiority
of the CRF approach.
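As a rough illustration of the metric, the sketch below computes a Jaccard score between a predicted and a reference informer, each given as a set of token positions. It assumes that Jaccard accuracy here means the standard set overlap |A ∩ B| / |A ∪ B|; the function name, the convention for two empty sets, and the example positions are hypothetical.

```python
from typing import Iterable


def jaccard(predicted: Iterable[int], gold: Iterable[int]) -> float:
    """Standard Jaccard overlap between two sets of token positions."""
    p, g = set(predicted), set(gold)
    if not p and not g:
        # Assumed convention: two empty informers count as a perfect match.
        return 1.0
    return len(p & g) / len(p | g)


# Hypothetical positions in "Who is the CEO of IBM": Who=0, CEO=3, of=4.
print(jaccard([0], [3]))     # prediction "Who" vs. reference "CEO"
print(jaccard([3, 4], [3]))  # prediction "CEO of" vs. reference "CEO"
print(jaccard([3], [3]))     # exact match
```

Averaging such per-question scores over a test set would give one accuracy figure per method, which is the kind of number a comparison like this reports.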
Unlike the heuristic approach, the CRF approach is relatively robust to
the parser emitting a somewhat incorrect parse tree, which is not uncommon.
The heuristic approach picks the “easy” informer, who, over the better one,
CEO, in “Who is the CEO of IBM.” Its bias toward the NP-head can also be
a problem, as in “What country's president ....
 
 