correct result set in 82% of the cases. For the other questions, it can be found
within the first 10 generated answers for 99% of the questions (once the 33
questions above have been removed). This can be observed in Figure 6, which
plots the Recall (of the correct question) curve of the generative approach, i.e.,
the baseline. As the figure shows, the right query is found among the
first three in 93% of the cases.
6.3 Reranking Results
Figure 6 also shows the plots for different rerankers using the following kernels:
STK+STK, STK × STK and (1+STK × STK)^2, which provide better rankings
(the first STK is applied to the question parse trees whereas the second STK
is applied to the query derivation tree). For example, the latter kernel retrieved
the correct answer 94% of the time using only the first two answers.
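The three ways of combining a question kernel with a query kernel can be sketched as follows. This is a minimal illustration, not the authors' implementation: `toy_kernel` is a hypothetical stand-in for a real syntactic tree kernel (it only counts shared labels in flat lists), and the `(question, query)` pair representation and all function names are assumptions made for the example.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple

# Assumed representation: each example is a (question structure, query structure) pair.
Pair = Tuple[Sequence[str], Sequence[str]]
Kernel = Callable[[Sequence[str], Sequence[str]], float]


def toy_kernel(a: Sequence[str], b: Sequence[str]) -> float:
    """Hypothetical stand-in for STK: dot product of label counts."""
    ca, cb = Counter(a), Counter(b)
    return float(sum(ca[label] * cb[label] for label in ca))


def sum_kernel(k_q: Kernel, k_d: Kernel) -> Callable[[Pair, Pair], float]:
    """K + K: question kernel plus query kernel."""
    return lambda x, y: k_q(x[0], y[0]) + k_d(x[1], y[1])


def product_kernel(k_q: Kernel, k_d: Kernel) -> Callable[[Pair, Pair], float]:
    """K x K: question kernel times query kernel."""
    return lambda x, y: k_q(x[0], y[0]) * k_d(x[1], y[1])


def poly_product_kernel(k_q: Kernel, k_d: Kernel, degree: int = 2) -> Callable[[Pair, Pair], float]:
    """(1 + K x K)^degree: polynomial expansion of the product kernel."""
    return lambda x, y: (1.0 + k_q(x[0], y[0]) * k_d(x[1], y[1])) ** degree


# Toy data: flat label lists instead of real parse or derivation trees.
x: Pair = (["NP", "VP", "NP"], ["SELECT", "FROM"])
y: Pair = (["NP", "VP"], ["SELECT", "WHERE"])

print(sum_kernel(toy_kernel, toy_kernel)(x, y))
print(product_kernel(toy_kernel, toy_kernel)(x, y))
print(poly_product_kernel(toy_kernel, toy_kernel)(x, y))
```

Because sums, products and polynomial expansions of valid kernels are themselves valid kernels, any of these combinations can be handed to a kernel machine without further changes.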
To better evaluate the results of our rerankers, we applied standard 10-fold
cross validation and measured the average Recall and standard deviation (Std. Dev.)
in selecting a query for each question. The results for the different kernel models
used for reranking are reported in Table 2. The first column of Table 2 lists the
kernel combinations, formed as the product or sum of pairs of basic kernels used
for the question and the query, respectively. The other columns show the percentage
of questions for which we found at least one correct answer in the top X positions
(average Recall@X over 10 folds ± Std. Dev.).
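The Recall@X statistic described above can be computed along the following lines. This is a sketch under stated assumptions: each question is represented by a list of booleans marking which ranked candidates are correct, the toy folds are invented for illustration, and the sample standard deviation (`statistics.stdev`) is used because the text does not say which variant was applied.

```python
from statistics import mean, stdev
from typing import Sequence, Tuple

# Assumed layout: one list of correctness flags per question, in ranked order.
Fold = Sequence[Sequence[bool]]


def recall_at_k(fold: Fold, k: int) -> float:
    """Percentage of questions with at least one correct answer in the top k."""
    hits = sum(any(flags[:k]) for flags in fold)
    return 100.0 * hits / len(fold)


def recall_over_folds(folds: Sequence[Fold], k: int) -> Tuple[float, float]:
    """Average Recall@k over the folds and its sample standard deviation."""
    scores = [recall_at_k(fold, k) for fold in folds]
    return mean(scores), stdev(scores)


# Invented toy data: 2 folds of 4 questions, 2 ranked candidates each.
folds = [
    [[True, False], [False, True], [False, False], [True, True]],
    [[True, False], [True, False], [True, True], [False, True]],
]

for k in (1, 2):
    avg, sd = recall_over_folds(folds, k)
    print(f"Rec@{k}: {avg:.1f} ± {sd:.1f}")
```

With real data there would be 10 folds, and each row of flags would come from comparing the reranked queries against the gold query for that question.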
The results are rather exciting since they compare favorably with the state of
the art. The best system on this dataset was designed in [15] and shows a
Precision of 96.3% and a Recall of 79.3%, for an F-measure of 86.9%, while our
system shows a Precision of 82.8% and a Recall of 87.2%, for an F-measure of
85.0% (when we include the 33 missing questions in the evaluation). Two main
facts should be noted:
- our system scores just 2 points below the system designed in [15],
but it does not need any hand-crafted resource, i.e., the semantic
trees manually designed in [15] for each question, and it is very simple to
implement.
- unlike previous work, we can also provide multiple ranked
answers. If we select the first n candidates, we greatly increase the Recall
Table 2. Kernel combination recall (± Std. Dev.) for the Geo dataset

Combination     Rec@1        Rec@2        Rec@3        Rec@4        Rec@5
NO RERANKING    81.4 ± 5.8   87.6 ± 3.8   90.8 ± 3.1   94.0 ± 2.4   95.0 ± 2.0
STK+STK         83.5 ± 3.6   90.4 ± 3.5   94.2 ± 2.9   95.8 ± 2.0   96.7 ± 1.7
STK × STK       86.5 ± 4.0   92.6 ± 3.7   95.3 ± 3.2   97.0 ± 1.8   97.7 ± 1.4
(1+STK^2)^2     87.2 ± 3.9   94.1 ± 3.4   95.6 ± 2.7   97.1 ± 1.9   97.9 ± 1.4
BOW × STK       86.7 ± 4.1   92.1 ± 3.2   95.6 ± 2.5   97.1 ± 1.4   97.6 ± 1.2
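The F-measure figures quoted in the comparison with [15] can be checked from the stated Precision and Recall. The sketch below assumes the balanced F-measure (harmonic mean of Precision and Recall), which the text does not state explicitly; the function name is illustrative.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall (assumed variant)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and Recall (in %) as quoted in the text.
print(round(f_measure(96.3, 79.3), 2))  # system designed in [15]
print(round(f_measure(82.8, 87.2), 2))  # system described here
```

Recomputing from the rounded Precision and Recall gives values within about 0.1 of the reported 86.9% and 85.0%. The small gap is presumably because the published figures were derived from unrounded scores.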
 