4.3 Evaluation in a Use Case
In this section, we define a specific use case and evaluate the accuracy improvement and the context registration.
Accuracy Improvement. As a use case where the question-answering system is useful, we focused on a situation in which the user asks for the reason for problems of plant growth such as physiological disorders, diseases, and pests. We collected questions on plant growth from Q&A sites such as Yahoo! Answers and translated some of them into spoken language to build a dataset. Examples are as follows: Why have the leaves of black pine died? / Does corn need fertilizer? / Where is a suitable space for sage? / Why has the bark of apple become brown? / Does leaf curl affect passion fruit? / Why have some edges of the leaves of kalanchoe been dying? / Is tomato with white patterns safe to eat? However, we limited the questions to those about plant species registered in the LOD.
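As an illustration of how such a question is handled, the following minimal sketch expresses "Why have the leaves of black pine died?" as a triple pattern and evaluates it with SPARQL over a toy graph built with rdflib; the namespace, the property name leafDyingReason, and the facts are assumptions for illustration, not the actual Plant Cultivation LOD schema.

```python
# Minimal sketch (hypothetical namespace, property, and facts): the question
# "Why have the leaves of black pine died?" is expressed as the triple pattern
# (black_pine, leafDyingReason, ?answer) and evaluated with SPARQL.
from rdflib import Graph, Namespace, Literal

PLANT = Namespace("http://example.org/plant-cultivation-lod#")

g = Graph()
# Toy facts standing in for the contents of the Plant Cultivation LOD.
g.add((PLANT.black_pine, PLANT.leafDyingReason, Literal("pine wilt disease")))
g.add((PLANT.black_pine, PLANT.leafDyingReason, Literal("root rot from overwatering")))

query = """
PREFIX plant: <http://example.org/plant-cultivation-lod#>
SELECT ?answer WHERE {
    plant:black_pine plant:leafDyingReason ?answer .
}
"""

for row in g.query(query):
    print(row.answer)   # candidate answers returned to the user
```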
In the experiment, we divided the 90 collected questions into 10 sets. Then we randomly selected one set and evaluated it and the following set consecutively as a test. For one of the three answers per query, we gave the correct feedback (the correct answer was also retrieved from the Q&A sites), which means the registration of the {verb, property} mapping and the incrementation of its confidence value. After the evaluation of the second set, we cleared all effects of the user feedback and repeated the above from the first set. The difference in accuracy between the first and the second set corresponds to the improvement due to the user feedback. We assumed that the query sentence is entered correctly and did not consider voice recognition errors, since we can select the correct sentence from the results of the Google voice recognition. The result is shown in Table 2.
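The feedback operation described above, registering a {verb, property} mapping and incrementing its confidence value, can be pictured with the following minimal sketch; the class name, data structure, and increment value are assumptions made for illustration, not the implementation used in our system.

```python
from collections import defaultdict

class VerbPropertyRegistry:
    """Confidence-weighted mapping from question verbs to LOD properties
    (illustrative sketch; names and values are assumptions)."""

    def __init__(self):
        # registry[verb][property] -> confidence value
        self.registry = defaultdict(lambda: defaultdict(float))

    def candidates(self, verb, n=3):
        """Return the n-best properties for a verb, ranked by confidence."""
        ranked = sorted(self.registry[verb].items(),
                        key=lambda kv: kv[1], reverse=True)
        return [prop for prop, _ in ranked[:n]]

    def feedback(self, verb, prop, increment=1.0):
        """Correct-answer feedback: register the {verb, property} mapping
        if unseen, otherwise increment its confidence value."""
        self.registry[verb][prop] += increment


registry = VerbPropertyRegistry()
registry.feedback("die", "leafDyingReason")   # user confirms this answer
registry.feedback("die", "witheringCause")
registry.feedback("die", "leafDyingReason")   # confirmed again -> higher confidence
print(registry.candidates("die"))             # ['leafDyingReason', 'witheringCause']
```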
Table 2. Accuracy of search

                 True                 False (avg. of both sets)
                 1-best    3-best     no Prop.    triplification error
1st Set (avg.)   55.6%     66.7%      22.2%       11.1%
2nd Set (avg.)   66.7%     66.7%
In the table, the results for False are averages over the first and the second set. We found that the prepared Properties cover only approx. 80% of the questions; we therefore plan to expand the Properties defined in the Plant Cultivation LOD. In addition, the accuracy of the conversion from a sentence to a triple (triplification) was rather high, at almost 90%. The current extraction mechanism is rule-based, but we intend to extend the rules and to apply machine learning techniques in order to handle a broader range of questions.
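As a rough illustration of such a rule, the following sketch converts question patterns into (subject, property) pairs with hand-written rules; the patterns and the property names are assumptions for illustration, not the rules actually used by the extraction mechanism.

```python
import re

# Hand-written triplification rules (hypothetical): each pattern extracts
# the plant name and related slots, and names a candidate LOD property.
RULES = [
    (re.compile(r"why have the (?P<part>[\w ]+) of (?P<plant>[\w ]+) died", re.I),
     "dyingReason"),
    (re.compile(r"does (?P<plant>[\w ]+) need (?P<thing>[\w ]+)", re.I),
     "requires"),
]

def triplify(question):
    """Convert a question sentence into a (subject, property, slots) tuple,
    or None if no rule matches (counted as a triplification error)."""
    for pattern, prop in RULES:
        match = pattern.search(question)
        if match:
            return (match.group("plant").strip(), prop, match.groupdict())
    return None

print(triplify("Why have the leaves of black pine died?"))
# ('black pine', 'dyingReason', {'part': 'leaves', 'plant': 'black pine'})
print(triplify("Does corn need fertilizer?"))
# ('corn', 'requires', {'plant': 'corn', 'thing': 'fertilizer'})
```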
On the other hand, the 3-best accuracy for True was about 67%. By comparing the 1-best result of the first set with that of the second set, we can confirm that the problem raised in Section 4.1, the mapping of the verb in the question to the LOD schema, was improved by about 10% through the user feedback. (Note that 1-best accuracy