To the best of our knowledge, AgreementMaker [Cruz et al. 2007, 2009] is the
only tool that enables users to select the type of cardinality to be discovered. Given
two input schemas, the mapping cardinality is either 1:1, 1:n, n:1, or n:m. In a 1:1
configuration, the matcher is limited to discovering mappings between one element of
the first schema and one element of the other schema. Only a few matchers emphasize
complex mappings such as n:m, in which any number of elements in both
schemas can be involved in a mapping.
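The effect of the cardinality setting can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the selection strategy of AgreementMaker or any other tool: the function name select_pairs, the min_score cut-off, the greedy 1:1 strategy, and the sample scores are all hypothetical. For simplicity, an n:m result is represented as a set of element pairs in which elements may repeat.

```python
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (element of first schema, element of second schema)


def select_pairs(scores: Dict[Pair, float], cardinality: str,
                 min_score: float = 0.5) -> List[Pair]:
    """Pick element pairs from precomputed similarity scores (hypothetical helper)."""
    # Best-scoring candidates first; ties are broken alphabetically.
    candidates = sorted(
        (pair for pair, score in scores.items() if score >= min_score),
        key=lambda pair: (-scores[pair], pair),
    )
    if cardinality == "n:m":
        # Any element may take part in several pairs.
        return candidates
    if cardinality == "1:1":
        used_left, used_right = set(), set()
        selected = []
        for left, right in candidates:
            # Greedy choice: skip a pair if either element is already mapped.
            if left not in used_left and right not in used_right:
                selected.append((left, right))
                used_left.add(left)
                used_right.add(right)
        return selected
    raise ValueError(f"unsupported cardinality: {cardinality}")


# Made-up similarity scores, for illustration only.
scores = {
    ("name", "first_name"): 0.8,
    ("name", "last_name"): 0.7,
    ("addr", "address"): 0.9,
    ("addr", "last_name"): 0.2,
}
one_to_one = select_pairs(scores, "1:1")
many_to_many = select_pairs(scores, "n:m")
```

With these scores, the 1:1 setting keeps a single partner for "name", whereas the n:m setting retains both "first_name" and "last_name" as candidates.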
3.5 Conclusion
In this section, we have mainly presented user inputs, i.e., optional preferences and
parameters applied to data. To sum up, the matching quality can be improved by using
external resources and expert feedback. Several tools are based on machine learning
techniques either as a similarity measure (mostly at the instance level) or as a means
of combining the results of similarity measures. In both cases, training data is a
crucial issue. Finally, many tools propose preferences or options which add more
flexibility or may improve the matching quality. The next section focuses on the
parameters at the similarity measure level.
4 Similarity Measures Parameters
Similarity measures are the basic components of schema matchers. They can be used
as individual algorithms or combined with an aggregation function. Consequently,
they may have internal parameters. In most cases, schema matchers do not enable
users to tune such low-level parameters. Another parameter applied to similarity
measures is the threshold. It sorts pairs of schema elements into different categories
(e.g., is a correspondence, or should be passed to another type of similarity measure) based
on the output of the similarity measures. The last part of this section is dedicated to
parameters specific to one or several matchers.
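As a minimal sketch of how an aggregation function and thresholds might interact, the Python fragment below combines the outputs of several measures with a weighted average and then assigns the pair to a category. The function names, the weighted-average choice, the two threshold values (0.8 and 0.3), and the category labels are assumptions made for illustration; actual matchers differ in how they aggregate and in how many thresholds they use.

```python
from typing import Sequence


def aggregate(similarities: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of the outputs of several similarity measures."""
    if len(similarities) != len(weights) or not similarities:
        raise ValueError("need one weight per similarity value")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(s * w for s, w in zip(similarities, weights)) / total


def categorize(similarity: float, accept: float = 0.8,
               reject: float = 0.3) -> str:
    """Assign a pair of schema elements to a category using two thresholds."""
    if similarity >= accept:
        return "correspondence"          # kept as a match
    if similarity < reject:
        return "discarded"               # clearly unrelated
    return "needs_another_measure"       # uncertain: try another measure


# Two hypothetical measure outputs for one pair of elements.
combined = aggregate([0.9, 0.5], weights=[0.5, 0.5])
category = categorize(combined)
```

Here the combined value falls between the two thresholds, so the pair is neither accepted nor discarded but handed to a further measure.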
4.1 Internal Parameters
A similarity measure takes as input two schema elements and outputs a similarity
between them. This similarity value may be a numerical value (e.g., a distance, a
real in the range [0, 1]) or a relationship (e.g., equivalence, generalization). Similar
to black-box algorithms, similarity measures can have internal parameters which
impact the output. Due to the numerous available similarity measures, we do not
intend to describe all of them with their parameters. Thus, we focus on two simple
examples to illustrate various types of such internal parameters.
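As a rough illustration of what an internal parameter can look like (an example chosen here, not necessarily one the text has in mind), consider a character n-gram measure based on the Dice coefficient. The gram size n is internal to the measure, and changing it changes the returned value for the same pair of labels. The function names and the default n = 3 are assumptions.

```python
def ngrams(label: str, n: int) -> set:
    """Set of character n-grams of a lower-cased label."""
    text = label.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def ngram_similarity(a: str, b: str, n: int = 3) -> float:
    """Dice coefficient over character n-grams; n is the internal parameter."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a or not grams_b:
        # A label shorter than n yields no n-grams.
        return 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


# Same pair of labels, two settings of the internal parameter.
with_trigrams = ngram_similarity("price", "prices", n=3)
with_bigrams = ngram_similarity("price", "prices", n=2)
```

The two calls return different values for the same input, which is why such a parameter, although usually hidden from users, affects the final matching result.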