are more difficult to identify than 1:1 correspondences that are more likely for small
match tasks. Some large-scale problems in the Ontology Alignment Evaluation
Initiative (OAEI) contest on ontology matching are still not satisfactorily solved after
several years. For example, the best F-measure¹ result for the catalog test to match
web directories (71%) was achieved in 2007; in 2009, the best participating system
achieved merely 63%, and the average F-measure was around 50% (Euzenat et al. 2009).
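To make the F-measure figures above concrete, the following is a minimal sketch of how such a score could be computed from a set of found correspondences and a reference mapping. It assumes the common balanced form of the measure (the harmonic mean of precision and recall); the function name `f_measure` and the example element pairs are hypothetical and are not taken from the OAEI evaluation.

```python
def f_measure(found, reference):
    """Return (precision, recall, F-measure) for a match result.

    Assumption: the balanced F-measure, i.e. the harmonic mean of
    precision and recall. Both arguments are collections of
    (source_element, target_element) correspondences.
    """
    found, reference = set(found), set(reference)
    correct = len(found & reference)  # correspondences that are also in the reference
    precision = correct / len(found) if found else 0.0
    recall = correct / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical example: 4 reference correspondences, 3 found, 2 of them correct.
reference = {("Name", "FullName"), ("Street", "Addr"), ("Zip", "PostCode"), ("Tel", "Phone")}
found = {("Name", "FullName"), ("Street", "Addr"), ("Zip", "Phone")}

precision, recall, f = f_measure(found, reference)
print(f"precision={precision:.2f} recall={recall:.2f} F={f:.2f}")
```

Because the harmonic mean is dominated by the smaller of the two values, a system cannot reach a high F-measure by returning very few (but correct) or very many (but mostly wrong) correspondences.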
Efficiency is another challenge for large-scale matching. Current match systems
often require the schemas and intermediate match results to fit in main memory,
thereby limiting their applicability for large-scale match tasks. Furthermore,
evaluating large search spaces is time-consuming, especially if multiple matchers need
to be evaluated and combined. For some OAEI match tasks and systems, execution
times on the order of several hours or even days are observed (Euzenat et al. 2009).
For interactive use of schema matching systems, such execution times are clearly
unacceptable.
In this chapter, we provide an overview of recent approaches to improve
effectiveness and efficiency for large-scale schema and ontology matching. We only
briefly discuss further challenges such as support for sophisticated user interaction
or the evaluation of match quality; these are treated in more detail in other
chapters of this book (Falconer and Noy 2011; Bellahsene et al. 2011). For example,
advanced GUIs should be supported to visualize large schemas and mappings, to
specify automatic match strategies (selection of matchers, parameter tuning), to
incrementally start automatic schema matching and adapt match results, etc.
In the next section, we introduce the kinds of matchers used in current match
systems as well as a general workflow to combine the results of multiple
matchers for improved match quality. We also discuss performance optimizations for
single matchers and present recently proposed approaches for instance-based and
usage-based matching. In Sect. 3, we present several match strategies that we
consider especially promising for large-scale matching: early pruning of the search
space, partition-based matching, parallel matching, self-tuning match workflows,
and reuse-based matching. We also briefly discuss approaches for n-way (holistic)
schema matching. Section 4 contains a short discussion of match support in
commercial systems and a comparison of selected research prototypes that have been
applied to large match problems.

2 Matchers and Match Workflows
The systems developed for schema and ontology matching typically support several
match algorithms (or matchers) and combine their results for improved match
quality. There exists a large spectrum of possible matchers and different implementations

¹ F-measure combines recall and precision, two standard measures to evaluate the effectiveness
of schema matching approaches (Do et al. 2003).