Retrieving Wiki Content Using an Ontology - Mining and Analyzing Social Networks

domain. Thus, the organization should perform a few initial retrieval experiments and set this threshold empirically, based on the final results of each retrieval run, which are sorted and presented with their relevance indices.

The class families and properties in an ontology affect the final calculations. In addition, one domain may have more concepts than another, which changes the relevance distribution even for comparable topic sets. Therefore, although the final results are always values between 0 and 1, they are specific to each ontology, whether the ontologies represent the same domain or different ones.

6.2 Discrepant Weights

During the tool assessment, we observed an interesting false positive produced by the retrieval algorithm. A topic part with no relevance at all with respect to the ontology appeared among the first entries of the resulting relevance ranking.

Analysis of the case showed that the problem was caused by a single isolated keyword in the topic that was spelled the same way as a concept in the ontology. Coincidentally, this concept does not appear anywhere else. As a result, the associated idf was very high, which influenced the construction of the weight vectors for both the document equivalent and the query equivalent. Since each concept in a vector represents a coordinate in a multidimensional space, the corresponding dimension in both vectors was out of proportion to the other concept dimensions, to the point that the cosine of the angle between the two vectors was very high.
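
The effect can be reproduced with a small numeric sketch. It assumes the classic idf form log(N / df) and plain tf-idf weights, which may differ from the exact formulas used by the tool; the collection size, document frequencies and term frequencies are hypothetical values chosen only for illustration.

```python
import math


def idf(n_docs, doc_freq):
    # Classic inverse document frequency (assumed form, not taken from the tool).
    return math.log(n_docs / doc_freq)


def cosine(u, v):
    # Cosine of the angle between two equally sized weight vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical collection: 1000 topic parts, three ontology concepts.
# The third concept occurs in a single topic part, so its idf is very high.
n_docs = 1000
doc_freq = [400, 500, 1]
idfs = [idf(n_docs, df) for df in doc_freq]

# The query equivalent mentions every concept once; the irrelevant topic
# part contains only the keyword that matches the rare concept.
query_tf = [1, 1, 1]
topic_tf = [0, 0, 1]

query_vec = [tf * w for tf, w in zip(query_tf, idfs)]
topic_vec = [tf * w for tf, w in zip(topic_tf, idfs)]

# Similarity with idf weighting versus the same vectors without it.
weighted_similarity = cosine(query_vec, topic_vec)
unweighted_similarity = cosine(query_tf, topic_tf)
```

In this sketch the rare dimension dominates both vectors, so the weighted similarity is much closer to 1 than the unweighted one, even though the topic part shares only one keyword with the query.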

This case shows that the retrieval algorithm needs some kind of treatment to avoid highly discrepant weights.
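
One possible treatment, given here only as an illustration and not as the approach adopted by the tool, is to cap each weight at a multiple of the median weight before building the vectors. The function name `cap_weights`, the `factor` parameter and the sample values are assumptions made for this sketch.

```python
import statistics


def cap_weights(weights, factor=2.0):
    # Limit every weight to `factor` times the median of the positive weights,
    # so that one very rare concept cannot dominate the vector.
    positive = [w for w in weights if w > 0]
    if not positive:
        return list(weights)
    ceiling = factor * statistics.median(positive)
    return [min(w, ceiling) for w in weights]


# Hypothetical idf values: the last concept is far above the others.
raw_idfs = [1.0, 2.0, 3.0, 40.0]
capped_idfs = cap_weights(raw_idfs)
```

With these values the median is 2.5, so the ceiling is 5.0 and only the last weight is reduced. The choice of `factor` is itself an empirical setting, and other dampening schemes could serve the same purpose.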

6.3 Distinction between Class Families

In the proposed ontology structure there is no way to specify different weights for different class families. Such functionality would be valuable, because each class family can reasonably be taken to represent an information subdomain, which may be more or less relevant than the other families.

Including class weights could further refine the information retrieval mechanism used by the implemented tool.
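
A minimal sketch of how such class weights might be applied is shown below. It assumes that each concept maps to exactly one class family and that the family weight simply multiplies the concept weight before the vectors are compared. The function `apply_family_weights`, the family names and all numeric values are hypothetical.

```python
def apply_family_weights(concept_weights, concept_family, family_weights, default=1.0):
    # Scale each concept weight by the weight of its class family.
    # Concepts without a family, or families without a weight, use `default`.
    return {
        concept: weight * family_weights.get(concept_family.get(concept), default)
        for concept, weight in concept_weights.items()
    }


# Hypothetical concept weights and class families.
concept_weights = {"price": 0.8, "warranty": 0.5, "color": 0.4, "misc": 0.3}
concept_family = {"price": "Commercial", "warranty": "Commercial", "color": "Appearance"}
family_weights = {"Commercial": 2.0, "Appearance": 0.5}

weighted = apply_family_weights(concept_weights, concept_family, family_weights)
```

Here the concepts of the "Commercial" family count twice as much as before, those of "Appearance" half as much, and "misc", which belongs to no family, keeps its original weight.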

7 Conclusions

This chapter presented an approach to semantic information retrieval over wikis. The idea was to provide a tool for following news or participation in consumer discussions. The wiki is expected to contain articles and discussions that are added continuously over a time frame, but the ideas can be ported to other social media, such as blogs and discussion lists.

The proposed tool can be used in several other scenarios where information retrieval is necessary or can bring improvements. Its main differences from similar tools and mechanisms are manifold: the semantic nature of the
