Text Search-Enhanced with Types and Entities - Text Mining: Classification, Clustering, and Applications - page 262

[Figure: β_j plotted against gap j (x-axis from 0 to 50, y-axis from 0 to 1), showing a rough curve and a smooth curve.]

FIGURE 10.15: β_j shows a noisy unimodal pattern.

10.3.3.4 Accuracy using the fitted decay

Finally, we plug in the smooth β in place of decay and make an end-to-end evaluation of the snippet ranking system. In a standard IR system (39), the score of a snippet would be decided by a vector space model using selectors alone. We gave the standard score the additional benefit of considering only those snippets centered at an atype candidate, and of considering each matched selector only once (i.e., using only IDF and not TF). Even so, a basic IR scoring approach was significantly worse than the result of plugging in β, as shown in Figure 10.16. “R300” is the fraction of truly relevant snippets recovered within the first 300 positions. The “reciprocal rank” for a fixed question is one divided by the first rank at which an answer snippet was found. Mean reciprocal rank, or MRR, is the reciprocal rank averaged over queries. Both recall and MRR over held-out test data improve substantially compared to the IR baseline.
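The IDF-only baseline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy snippets, and the smoothed IDF formula are all assumptions made for illustration, and the restriction to snippets centered at an atype candidate is not modeled.

```python
import math

def idf_table(snippets):
    """Map each token to a smoothed IDF computed over the given snippets.

    The smoothing formula is an assumption; the text does not specify one.
    """
    n = len(snippets)
    df = {}
    for tokens in snippets:
        for tok in set(tokens):          # document frequency: count once per snippet
            df[tok] = df.get(tok, 0) + 1
    return {tok: math.log((n + 1) / (count + 1)) + 1.0 for tok, count in df.items()}

def idf_only_score(snippet_tokens, selectors, idf):
    """Sum the IDF of each selector present in the snippet, counted once (no TF)."""
    present = set(snippet_tokens)
    return sum(idf.get(sel, 0.0) for sel in set(selectors) if sel in present)

# Hypothetical toy data, for illustration only.
snippets = [
    ["einstein", "was", "born", "in", "ulm"],
    ["einstein", "einstein", "einstein", "lectured"],
    ["ulm", "is", "a", "city"],
]
idf = idf_table(snippets)
selectors = ["einstein", "born"]
scores = [idf_only_score(s, selectors, idf) for s in snippets]
```

In this toy data the second snippet repeats a selector three times, yet it scores no higher than a single match would, which is the effect of dropping TF.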

β from     Train   Test   R300   MRR
IR-IDF     -       2000   211    0.16
RankExp    1999    2000   231    0.27
RankExp    2000    2000   235    0.31
RankExp    2001    2000   235    0.29

FIGURE 10.16: End-to-end accuracy using RankExp β is significantly better than IR-style ranking. Train and test years are from 1999, 2000, 2001. R300 is recall at k = 300 out of 261 test questions. C = 0.1, C = 1 and C = 10 gave almost identical results.
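The two measures reported in Figure 10.16 could be computed along the following lines. This is a hedged sketch: the function names and the toy relevance lists are hypothetical, and reading R300 as the number of questions with an answer snippet in the first 300 positions is an assumption based on the caption's "out of 261 test questions".

```python
def reciprocal_rank(ranked_relevance):
    """Return 1 / (rank of the first relevant snippet), or 0.0 if none is relevant."""
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(per_question):
    """Average the reciprocal rank over all questions."""
    return sum(reciprocal_rank(r) for r in per_question) / len(per_question)

def answered_within_k(per_question, k=300):
    """Count questions with at least one relevant snippet in the first k positions."""
    return sum(1 for r in per_question if any(r[:k]))

# Hypothetical toy data: one relevance list per question, in ranked order.
per_question = [
    [False, True, False],   # first answer at rank 2
    [True, False],          # first answer at rank 1
    [False, False, False],  # no answer found
]
mrr = mean_reciprocal_rank(per_question)
hits_at_300 = answered_within_k(per_question, k=300)
hits_at_1 = answered_within_k(per_question, k=1)
```

A question with no answer snippet contributes zero to the MRR here; that convention is common but is also an assumption, since the text does not state how such questions are handled.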

Observe that we used three years of TREC data (1999, 2000, 2001) for training and one year (2000) for testing. The accuracy listed for training year 2000 is meant only for sanity-checking because the training set is the same as
