grades and for the factor from object properties. A final relevance value is
obtained for each topic part as the arithmetic mean of the similarity average
and the factor average. The greatest final relevance values are expected to
come from the topic parts whose contents are semantically closest to the
domain of interest represented in the ontology.
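The combination step described above can be illustrated with a minimal sketch. It assumes that the similarity average and the factor average of each topic part have already been computed. The function name, the topic part names, and the numeric values are hypothetical and serve only to show the arithmetic.

```python
def final_relevance(similarity_average, factor_average):
    """Return the arithmetic mean of the two averages for one topic part."""
    return (similarity_average + factor_average) / 2


# Hypothetical topic parts: (similarity average, factor average).
# These values are illustrative only, not results from the tool.
topic_parts = {
    "part_a": (0.75, 0.5),
    "part_b": (0.25, 0.5),
}

# Compute the final relevance value of every topic part.
relevance = {
    name: final_relevance(similarity, factor)
    for name, (similarity, factor) in topic_parts.items()
}

# Rank topic parts from the greatest to the smallest final relevance.
ranked = sorted(relevance, key=relevance.get, reverse=True)
print(ranked)
```

Under these assumed values, the topic part with the higher averages is ranked first, which matches the expectation that contents closer to the domain of interest receive greater final relevance values.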
5 Assessment
A semantic-oriented retrieval tool should be satisfactory to any user. For the
developed tool, the requirement was set at a minimum of 80% for both precision
and recall, well-known metrics commonly used to assess retrieval processes and
tools [18].
Precision is the ratio between the number of retrieved documents that are
relevant and the total number of retrieved documents. It indicates the capacity
to keep irrelevant documents out of the final result. Recall is the ratio
between the number of retrieved documents that are relevant and the number of
documents that should be retrieved. It represents the capacity to retrieve
relevant documents.
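The two definitions above can be sketched as follows. This is an illustration under stated assumptions, not the code of the developed tool: documents are represented by hypothetical identifiers, the function names are invented for this example, and the 0.8 target mirrors the 80% requirement stated earlier.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall from two collections of document ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # retrieved documents that are relevant
    # Guard against empty sets to avoid a division by zero.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def meets_target(precision, recall, target=0.8):
    """Check whether both metrics reach the required minimum."""
    return precision >= target and recall >= target


# Hypothetical example: five documents should be retrieved,
# and the retrieval returns four of them plus one irrelevant document.
relevant = {"t1", "t2", "t3", "t4", "t5"}
retrieved = {"t1", "t2", "t3", "t4", "t9"}

precision, recall = precision_recall(retrieved, relevant)
print(precision, recall, meets_target(precision, recall))
```

In this made-up case four of the five retrieved documents are relevant and four of the five expected documents are retrieved, so both ratios equal 4/5.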
To assess the developed tool and its retrieval algorithm, a wiki was populated
with topics whose content and the expected retrieved result were previously
known, considering the defined ontology.
The wiki subject concerned a general issue to be discussed by a shopper
community. Some of the inserted topics were not related to this subject.
Thus, the wiki contained both relevant and irrelevant information.
The wiki content was composed of 35 topics, of which 14, corresponding to 40%
of the topic parts, were highly relevant. The other 60% covered unrelated
subjects.
Before beginning a retrieval process, the developed tool was configured. More
information on this follows in the discussion section.
6 Discussion
The formulas presented previously can be modified. The developed tool includes
a configuration facility that offers the opportunity to reach better and more
stable results, which are expected to differ according to the domain of
interest. It also allows the behavior of the proposed algorithm to be studied.
The authors believe that the main influence on obtaining relevant results is
the magnitude of the proposed relevance indices, as discussed next.
6.1 Magnitude of Calculated Relevance Indices
One of the main difficulties in using the retrieval algorithm is understanding
what the relevance indices obtained in a retrieval process mean.
Which threshold value an index must reach for a topic part to be considered
relevant in the domain of interest is an open question, which should be
analyzed for each domain. In the developed tool, this threshold can be configured to each