On Evaluating Schema Matching and Mapping - Schema Matching and Mapping

$$\text{f-measure}(\text{COMA++}) = \frac{2 \times 1 \times 0.69}{1 + 0.69} = 82\%$$

and

$$\text{f-measure}(\text{SF}) = \frac{2 \times 1 \times 0.54}{1 + 0.54} = 70\%$$
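The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name `f_measure` is an illustrative choice, and the precision of 1 and the recall values 0.69 and 0.54 are read off the two equations of the running example.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values taken from the running example: both tools have precision 1,
# COMA++ has recall 0.69 and SF has recall 0.54.
coma = f_measure(1.0, 0.69)
sf = f_measure(1.0, 0.54)
print(f"COMA++: {coma:.0%}, SF: {sf:.0%}")  # COMA++: 82%, SF: 70%
```

Guarding against a zero denominator is a design choice of this sketch, so that a tool returning no correct match gets an f-measure of 0 instead of raising an error.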

[Fallout] Another metric that is often used in the literature is the fallout [Euzenat et al. 2006][Ferrara et al. 2008]. It computes the rate of incorrectly discovered matches relative to the number of nonexpected matches. Intuitively, it measures the probability that an irrelevant match is discovered by the tool. The fallout is defined by the following formula:

$$\text{Fallout} = \frac{FP}{FP + TN}$$

In the running example of Fig. 9.8, the number of nonexpected, i.e., irrelevant, matches equals 253 (there exist a total of 266 possible matches, including the 13 that are relevant). However, since neither tool discovered any irrelevant match, their fallout equals 0%.

$$\text{Fallout}(\text{COMA++}) = \frac{0}{0 + 253} = 0\%$$

$$\text{Fallout}(\text{SF}) = \frac{0}{0 + 253} = 0\%$$
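As a rough illustration, the fallout computation for the running example can be sketched in Python. The function and variable names are assumptions made for this sketch; the counts (266 possible matches, 13 relevant, no irrelevant match discovered) come from the text. The last call uses a purely hypothetical tool, not one from the example, to show a nonzero value.

```python
def fallout(false_positives: int, true_negatives: int) -> float:
    """Share of irrelevant matches that were wrongly discovered: FP / (FP + TN)."""
    irrelevant = false_positives + true_negatives
    if irrelevant == 0:
        return 0.0  # no irrelevant match exists, so none can be discovered
    return false_positives / irrelevant


total_possible = 266                      # all possible matches in the running example
relevant = 13                             # matches expected by the expert
irrelevant = total_possible - relevant    # 253 nonexpected matches

# Neither COMA++ nor SF discovered an irrelevant match, so FP = 0 and TN = 253.
fp = 0
tn = irrelevant - fp
example_fallout = fallout(fp, tn)

# Hypothetical tool (not in the source) that returns 5 irrelevant matches.
hypothetical_fallout = fallout(5, irrelevant - 5)
print(f"{example_fallout:.0%} {hypothetical_fallout:.1%}")
```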

The matching benchmark XBenchMatch [Duchateau et al. 2007] and the ontology alignment API [Euzenat 2004] rely on the above metrics to evaluate the effectiveness of matching tools. They assume that the expected set of matches is available, provided by an expert user. Based on that set and the matches that the matching tool produces, the various values of the metrics are computed.
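The step from the two sets of matches to the counts used by the metrics can be sketched as follows. This is not the interface of XBenchMatch or of the alignment API; the function `confusion_counts`, the toy schemas, and the element names are all hypothetical and serve only to show how TP, FP, FN, and TN could be derived with set operations.

```python
from itertools import product


def confusion_counts(expected, produced, all_candidates):
    """Return (TP, FP, FN, TN) for matches represented as (source, target) pairs."""
    expected, produced = set(expected), set(produced)
    all_candidates = set(all_candidates)
    tp = len(produced & expected)                    # discovered and expected
    fp = len(produced - expected)                    # discovered but not expected
    fn = len(expected - produced)                    # expected but missed
    tn = len(all_candidates - expected - produced)   # correctly left out
    return tp, fp, fn, tn


# Hypothetical toy schemas: every source/target pair is a possible match.
source = ["name", "addr", "phone"]
target = ["fullName", "address"]
candidates = set(product(source, target))            # 6 possible matches

expected = {("name", "fullName"), ("addr", "address")}    # expert-provided set
produced = {("name", "fullName"), ("phone", "address")}   # tool output

tp, fp, fn, tn = confusion_counts(expected, produced, candidates)
print(tp, fp, fn, tn)  # 1 1 1 3
```

Precision, recall, f-measure, and fallout then follow directly from these four counts.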

A limitation of the above metrics is that they do not take into consideration any postmatch user effort, for instance, tasks that the user may need to perform to guide the matching tool during the matching process, or any iterations the user may perform to verify partially generated results.

Measuring the quality of mappings turns out to be more challenging than measuring the quality of the matches. The reason is that it requires comparisons among mappings, which is not a straightforward task. Finding whether a generated mapping belongs to the set of expected mappings requires a comparison between this mapping and every other mapping in that set. This comparison boils down to query equivalence. Apart from the fact that query equivalence is a hard task per se, a transformation described by a mapping may also be implemented through a combination of several different mappings. This means that it is not enough to compare with individual mappings only; combinations of mappings should also be considered. For this reason, direct mapping comparison has typically been avoided as an evaluation method for mapping tools. Researchers
