reasoners and five different heuristics. The two reasoners are standard Pellet and Pellet combined with approximate reasoning (not described in detail here). The five heuristics are those described in Section 6.3. For each configuration of CELOE, we generate at most 10 suggestions exceeding a heuristic threshold of 90%. Overall, this means that there can be at most 2 × 5 × 10 = 100 suggestions per class - usually fewer, because different settings of CELOE will still result in similar suggestions. This list is shuffled and presented to the evaluators; a short sketch of this generation step follows the option lists below. For each suggestion, the evaluators can choose between 6 options (see Table 6):
1 the suggestion improves the ontology (improvement)
2 the suggestion is no improvement and should not be included (not acceptable) and
3 adding the suggestion would be a modelling error (error)
In the case of existing definitions for class A, we removed them prior to learning. The evaluator could then choose between three further options:
4 the learned definition is equal to the previous one and both are good (equal +)
5 the learned definition is equal to the previous one and both are bad (equal -) and
6 the learned definition is inferior to the previous one (inferior).
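To make the generation step described above concrete, the following is a minimal sketch, not the actual CELOE/DL-Learner code: the reasoner and heuristic identifiers and the run_celoe hook are hypothetical placeholders, while the 90% threshold, the cap of 10 suggestions per configuration, and the deduplication and shuffling follow the text.

import itertools
import random

# Hypothetical identifiers; the real heuristic names are those of
# Section 6.3 and are not reproduced exactly here.
REASONERS = ["pellet", "pellet_approx"]
HEURISTICS = ["h1", "h2", "h3", "h4", "h5"]
THRESHOLD = 0.90      # heuristic threshold of 90%
MAX_PER_CONFIG = 10   # at most 10 suggestions per configuration

def suggestions_for_class(cls, run_celoe):
    """Collect, deduplicate and shuffle the suggestions for one class.

    run_celoe(cls, reasoner, heuristic) is a hypothetical hook that
    returns (expression, score) pairs ranked by score; at most
    2 * 5 * 10 = 100 raw suggestions per class can accumulate here.
    """
    pool = {}
    for reasoner, heuristic in itertools.product(REASONERS, HEURISTICS):
        ranked = run_celoe(cls, reasoner, heuristic)
        kept = [(e, s) for e, s in ranked if s >= THRESHOLD][:MAX_PER_CONFIG]
        for expr, score in kept:
            # different settings often rediscover the same expression,
            # so the final list is usually well below the 100 bound
            pool[expr] = max(score, pool.get(expr, 0.0))
    final = list(pool.items())
    random.shuffle(final)  # shuffled before being shown to the evaluators
    return final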
We used the default settings of CELOE, e.g. a maximum execution time of 10 seconds for the algorithm. The knowledge engineers were five experienced members of our research group, who made themselves familiar with the domain of the test ontologies. Each researcher worked independently and had to choose between the options above for each suggestion, making 998 decisions for 92 classes. The time required to make those decisions was approximately 40 working hours per researcher. The raw agreement value of all evaluators is
0.535 (see e.g. [4] for details) with 4 out of 5 evaluators in strong pairwise agreement
(90%). The evaluation machine was a notebook with a 2 GHz CPU and 3 GB RAM.
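The raw agreement figure can be reproduced with a short calculation. The sketch below uses made-up decision vectors, since the individual decisions are not part of this text; it computes raw (observed) agreement as the fraction of evaluator pairs choosing the same option, averaged over all decisions, in the spirit of the measure referenced in [4].

from itertools import combinations

def raw_agreement(decisions):
    """Raw (observed) agreement across multiple evaluators.

    decisions holds one entry per evaluated suggestion, each entry
    listing the option chosen by every evaluator, e.g. [1, 1, 2, 1, 1]
    for five evaluators. The score is the fraction of evaluator pairs
    that agree, averaged over all suggestions.
    """
    total = 0.0
    for votes in decisions:
        pairs = list(combinations(votes, 2))
        total += sum(a == b for a, b in pairs) / len(pairs)
    return total / len(decisions)

# Illustrative toy data only (the real decision vectors are not given here)
print(raw_agreement([[1, 1, 1, 2, 1], [3, 3, 2, 3, 3], [1, 2, 1, 1, 2]]))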
Table 6 shows the evaluation results. All ontologies were taken from the Protégé OWL42 and TONES43 repositories. We randomly selected 5 ontologies comprising instance data from these two repositories, specifically the Earthrealm, Finance, Resist, Economy and Breast Cancer ontologies (see Table 5).
The results in Table 6 show which options were selected by the evaluators. It clearly indicates that the usage of approximate reasoning is sensible. The results are, however, more difficult to interpret with regard to the different employed heuristics. Using predictive accuracy did not yield good results and, surprisingly, generalised F-Measure also had a lower percentage of cases where option 1 was selected. The other three heuristics generated very similar results. One reason is that those heuristics are all based on precision and recall, but in addition the low quality of some of the randomly selected test ontologies posed a problem. In cases of too many very severe modelling errors, e.g. conjunctions and disjunctions mixed up in an ontology or inappropriate domain and range restrictions, the quality of suggestions decreases for each of the heuristics. This is the main reason why the results for the different heuristics are very close. In particular, generalised F-Measure can show its strengths mainly for properly designed ontologies. For instance, column 2 of Table 6 shows that it missed 7% of possible improvements.
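Since three of the five heuristics are, as stated above, based on precision and recall, a small worked example helps explain why their rankings end up so close. The sketch below is an assumed, generic F-measure computed from the individuals a candidate class expression retrieves versus the instances of the target class; the concrete sets are hypothetical, and the exact scoring variants are those of Section 6.3.

def f_measure(retrieved, positives, beta=1.0):
    """F-measure style score for a candidate class expression.

    retrieved is the set of individuals the expression covers (via
    instance retrieval) and positives the instances of the target
    class; both are hypothetical inputs here.
    """
    if not retrieved or not positives:
        return 0.0
    tp = len(retrieved & positives)       # correctly covered instances
    precision = tp / len(retrieved)
    recall = tp / len(positives)
    if precision + recall == 0:
        return 0.0
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)

# Example: the expression covers {a, b, c, d}; the class has instances {a, b, c, e}
print(f_measure({"a", "b", "c", "d"}, {"a", "b", "c", "e"}))  # 0.75

Because such scores depend only on the overlap between retrieved and actual instances, heuristics built on them tend to rank the same suggestions highly, which matches the closeness of their results in Table 6.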
42 http://protegewiki.stanford.edu/index.php/Protege_Ontology_Library
43 http://owl.cs.manchester.ac.uk/repository/