Velardi et al. [27] describe an approach for the evaluation of an ontology learning
system which takes a body of natural-language text and tries to extract from it
relevant domain-specific concepts (terms and phrases), and then find definitions for
them (using web searches and WordNet entries) and connect some of the concepts
by is-a relations. Part of their evaluation approach is to generate natural-language
glosses for multiple-word terms. The glosses are of the form “xy = a kind of y, definition of y, related to the x, definition of x,” where y is typically a noun and
x is a modifier such as another noun or an adjective. A gloss like this would then
be shown to human domain experts, who would evaluate it to see if the word sense
disambiguation algorithm selected the correct definitions of x and y . An advantage
of this kind of approach is that domain experts might be unfamiliar with formal
languages in which ontologies are commonly described, and thus it might be easier
for them to evaluate the natural-language glosses. Of course, the downside of this
approach is that it nevertheless requires a lot of work on the part of the domain experts.
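The gloss template above can be sketched in a few lines of code. This is a minimal illustration, not the authors' implementation: the function name and inputs are hypothetical, and in the actual system the two definitions would be retrieved via web searches and WordNet rather than supplied directly.

```python
# Hypothetical sketch of generating a gloss for a two-word term "x y",
# where y is the head noun and x is a modifier. The definitions are
# passed in as plain strings here; in the described system they would
# come from WordNet entries or web searches.
def make_gloss(modifier, head, modifier_def, head_def):
    term = f"{modifier} {head}"
    return (f"{term} = a kind of {head}, {head_def}, "
            f"related to the {modifier}, {modifier_def}")

gloss = make_gloss(
    modifier="credit", head="card",
    modifier_def="an arrangement for deferred payment",
    head_def="a flat rectangular piece of plastic",
)
print(gloss)
```

A domain expert shown such a gloss can judge whether the selected definitions of x and y are the correct senses without ever reading the underlying formal ontology.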
11.2.3 Evaluation of Taxonomic and Other Semantic Relations
Brewster et al. [2] suggested using a data-driven approach to evaluate the degree of
structural fit between an ontology and a corpus of documents. (1) Given a corpus of
documents from the domain of interest, a clustering algorithm based on expectation
maximization is used to determine, in an unsupervised way, a probabilistic mixture
model of hidden “topics” such that each document can be modeled as having been
generated by a mixture of topics. (2) Each concept c of the ontology is represented
by a set of terms including its name in the ontology and the hypernyms of this name,
taken from WordNet. (3) The probabilistic models obtained during clustering can
be used to measure, for each topic identified by the clustering algorithm, how well the
concept c fits that topic. (4) At this point, if we require that each concept fits at
least some topic reasonably well, we obtain a technique for lexical-level evaluation of
the ontology. Alternatively, we may require that concepts associated with the same
topic should be closely related in the ontology (via is-a and possibly other relations).
This would indicate that the structure of the ontology is reasonably well aligned
with the hidden structure of topics in the domain-specific corpus of documents. A
drawback of this method as an approach for evaluating relations is that it is difficult
to take the directionality of relations into account. For example, given concepts
c1 and c2, the probabilistic models obtained during clustering in step (1) may be
enough to infer that they should be related, but they are not really sufficient to infer
whether, e.g., c1 is-a c2, or c2 is-a c1, or whether they should in fact be connected by
some other relation rather than is-a.
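Steps (3) and (4) can be sketched as follows. This is a simplified illustration under stated assumptions: each topic is represented as a word-probability distribution (as an EM-based mixture model would produce), each concept as a bag of terms (its name plus WordNet hypernyms), and the fit score is taken here simply as the probability mass a topic assigns to the concept's terms; the actual method may use a different scoring function.

```python
# Sketch of concept-to-topic fit. A topic is modelled as a dict mapping
# words to probabilities; a concept is a set of terms (its ontology name
# plus WordNet hypernyms). The fit score used here (total probability
# mass over the concept's terms) is an illustrative assumption.
def concept_topic_fit(concept_terms, topic_word_probs):
    return sum(topic_word_probs.get(t, 0.0) for t in concept_terms)

def best_fit(concept_terms, topics):
    # Lexical-level check: the concept should fit at least one topic well.
    return max(concept_topic_fit(concept_terms, t) for t in topics)

# Toy topics that an unsupervised clustering step might have produced.
topics = [
    {"bank": 0.20, "loan": 0.15, "credit": 0.10},   # finance-like topic
    {"river": 0.25, "water": 0.20, "bank": 0.05},   # geography-like topic
]
concept = {"loan", "credit", "debt"}  # concept name plus hypernyms
print(best_fit(concept, topics))  # 0.25, from the finance-like topic
```

Requiring `best_fit` to exceed a threshold for every concept yields the lexical-level evaluation; the structural variant additionally checks that concepts scoring highly on the same topic are close to each other in the ontology graph.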
Given a gold standard, evaluation of an ontology on the relational level can also
be based on precision and recall measures. Spyns [25] discusses an approach for
automatically extracting a set of lexons, i.e., triples of the form ⟨term1, role, term2⟩,
from natural-language text. The result can be interpreted as an ontology, with terms
corresponding to concepts and roles corresponding to (non-hierarchical) relations
between concepts. Evaluation was based on precision and recall, comparing the
ontology either with a human-provided gold standard, or with a list of statistically
relevant terms. The downside of this approach is again the substantial amount of
manual work required to prepare the gold standard.
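Precision and recall over extracted triples reduce to standard set operations. The sketch below assumes each lexon is represented as a (term1, role, term2) tuple and that matching against the gold standard is exact; real evaluations may allow partial or normalized matches.

```python
# Precision and recall of an extracted set of lexon triples against a
# gold standard, assuming exact triple matching (an assumption; actual
# evaluations may normalize terms first).
def precision_recall(extracted, gold):
    extracted, gold = set(extracted), set(gold)
    tp = len(extracted & gold)  # true positives: triples found in both
    precision = tp / len(extracted) if extracted else 0.0
    recall = tp / len(gold) if gold else 0.0
    return precision, recall

gold = {("enzyme", "catalyses", "reaction"),
        ("protein", "binds", "ligand"),
        ("cell", "contains", "nucleus")}
extracted = {("enzyme", "catalyses", "reaction"),
             ("protein", "binds", "ligand"),
             ("ligand", "binds", "protein")}

p, r = precision_recall(extracted, gold)
print(p, r)  # 2/3 precision, 2/3 recall
```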
A somewhat different aspect of ontology evaluation has been discussed by Guarino
and Welty [11]. They point out several philosophical notions (essentiality, rigid-