Databases Reference
In-Depth Information
1 Introduction
The gap between manual and semi-automatic schema matching was bridged early on,
driven mainly by the need to handle large schemas and to accelerate the matching
process [Carmel et al. 2007]. The next step, towards automatic schema matching,
is mainly motivated by the lack of human experts, for instance in dynamic
environments. In all cases, tuning is required mainly to improve result quality
and/or time performance. Fig. 10.1 illustrates this statement: four schema
matchers (YAM, COMA++, Similarity Flooding and YAM with tuning) were run on the
same scenario, and only the last one was tuned. The plot compares the number of
user interactions needed to reach a 100% F-measure by manually correcting each
matcher's output. In brief, a user interaction is the validation or invalidation,
by the user, of a given pair of schema elements [Bellahsene et al. 2011]. When a
tool discovers many relevant correspondences, the user needs fewer interactions
to correct its output and find the missing ones. The plot clearly shows that the
tuned matcher improves quality and consequently reduces post-match effort.
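To make the quantities in this comparison concrete, the following is a minimal
sketch of how an F-measure can be computed from a matcher's output and a set of
expected correspondences, using the standard precision/recall definitions. The
schema element names and the function review_lower_bound are hypothetical, and
the interaction count it returns is only a crude lower bound (one validation per
discovered pair plus one per missing correspondence), not the exact measure of
[Bellahsene et al. 2011].

```python
def f_measure(found, expected):
    """Return (precision, recall, F-measure) for two sets of correspondences."""
    found, expected = set(found), set(expected)
    correct = found & expected
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(expected) if expected else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def review_lower_bound(found, expected):
    """Crude lower bound on user interactions (an assumption, see lead-in):
    one (in)validation per discovered pair, plus at least one validation
    for each expected correspondence the matcher missed."""
    found, expected = set(found), set(expected)
    return len(found) + len(expected - found)


# Hypothetical scenario: three expected correspondences.
expected = {
    ("person.name", "customer.fullName"),
    ("person.zip", "customer.postcode"),
    ("person.phone", "customer.tel"),
}
# Hypothetical outputs of an untuned and a tuned matcher.
untuned = {
    ("person.name", "customer.fullName"),
    ("person.name", "customer.tel"),  # wrong pair the user must invalidate
}
tuned = set(expected)

for label, output in (("untuned", untuned), ("tuned", tuned)):
    p, r, f = f_measure(output, expected)
    print(label, round(p, 2), round(r, 2), round(f, 2),
          review_lower_bound(output, expected))
```

In this toy scenario the tuned output has both a higher F-measure and a lower
review bound, which mirrors the qualitative claim of the paragraph above.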
Tuning, either automatic or manual, is performed during the pre-match phase of
the schema matching process. The main motivation for tuning a schema matcher is
that it is difficult, even for a human expert, to know in advance the best
parameter configuration for a given set of schemas. The heterogeneity, structure
and domain specificity of each set of schemas to be matched make it difficult
for a schema matcher to achieve acceptable results in all cases. Thus, tuning
gives schema matchers the flexibility and customization they need to cope with
the particular features of each set of schemas.
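As an illustration of what tuning a single parameter can look like, here is a
minimal sketch that selects a similarity threshold by exhaustive search. It
assumes that a small sample of expert-validated correspondences is available
during the pre-match phase, which is an assumption of this example and not a
requirement stated in the text. The names select, tune_threshold and the
similarity scores are hypothetical and do not reflect the parameters of YAM,
COMA++ or Similarity Flooding.

```python
def select(scores, threshold):
    """Keep the candidate pairs whose similarity reaches the threshold."""
    return {pair for pair, score in scores.items() if score >= threshold}


def f1(found, reference):
    """Standard F-measure of a set of pairs against a reference set."""
    correct = len(found & reference)
    if correct == 0:
        return 0.0
    precision = correct / len(found)
    recall = correct / len(reference)
    return 2 * precision * recall / (precision + recall)


def tune_threshold(scores, reference, candidates):
    """Return (threshold, F-measure) for the best-scoring candidate threshold.
    Ties are resolved in favour of the first candidate tried."""
    best_threshold, best_f = None, -1.0
    for threshold in candidates:
        f = f1(select(scores, threshold), reference)
        if f > best_f:
            best_threshold, best_f = threshold, f
    return best_threshold, best_f


# Hypothetical similarity scores for pairs of schema elements.
scores = {("a", "x"): 0.9, ("b", "y"): 0.6, ("a", "y"): 0.5, ("c", "z"): 0.3}
# Hypothetical expert-validated sample.
reference = {("a", "x"), ("b", "y")}

best_threshold, best_f = tune_threshold(scores, reference, [0.2, 0.4, 0.6, 0.8])
print(best_threshold, best_f)
```

A different set of schemas would yield different scores and therefore possibly
a different best threshold, which is why a configuration fixed in advance
rarely suits every matching scenario.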
Fig. 10.1 A tuned matcher mainly improves matching quality
[Plot not reproduced. Horizontal axis: number of interactions (0 to 2500);
vertical axis: 0 to 100; curves: YAM, YAM-tuned, COMA++, SF.]
 