Mapper, which is embedded in Microsoft Visual Studio [Microsoft 2005], Stylus
Studio [Stylus Studio 2005], BEA AquaLogic [Carey 2006], and the research
prototypes Rondo [Do and Rahm 2002], COMA [Aumueller et al. 2005], Harmony
[Mork et al. 2008], S-Match [Giunchiglia et al. 2005], Cupid [Madhavan et al.
2001], Clio [Popa et al. 2002], Tupelo [Fletcher and Wyss 2006], Spicy [Bonifati
et al. 2008a], and HePToX [Bonifati et al. 2010].
Despite the availability of many mapping tools, no generally accepted benchmark
has been developed for comparing and evaluating them. As is the case with
other benchmarks, such a development is of major importance for assessing the
relative merits of the tools. It can help customers make the right investment
decisions and select, among the many alternatives, the tools that best fit their
business needs. A benchmark can also help mapping tool developers, as it offers
them a common metric for comparing their own achievements against those of their
competitors. Such comparisons can boost competition and drive development toward
systems of higher quality. A benchmark also offers developers a generally
accepted language for talking to customers and describing the advantages of their
tools through well-known features that determine performance, effectiveness, and
usability. Furthermore, a benchmark can highlight limitations of the mapping tools,
or unsupported features, that the developers may not have noticed. Finally,
a benchmark is also needed in the research community [Bertinoro 2007]. Apart from
providing a common platform for comparison, a benchmark allows researchers to
evaluate their achievements not only in terms of performance but also in terms of
applicability in real-world situations.
In this work, we summarize and present in a systematic way existing efforts
toward the characterization and evaluation of mapping tools, and toward the
establishment of a benchmark. After a quick introduction to the architecture and
main functionality of matching and mapping tools in Sect. 2, we describe the
challenges of building a matching/mapping system benchmark in Sect. 3. Section 4
presents existing efforts to collect real-world test cases intended for evaluating
matching and mapping systems. Section 5 addresses the creation of synthetic
test cases that target the evaluation of specific features of mapping systems.
Finally, Sects. 6 and 7 present the metrics that have been proposed in
the literature for measuring the efficiency and the effectiveness of
matching/mapping systems, respectively.
2 The Matching and Mapping Problem
Matching is the process that takes as input two schemas, referred to as the source
and the target, and produces a number of matches, also known as correspondences,
between the elements of these two schemas [Rahm and Bernstein 2001]. The term
schema is used here in its broader sense and includes database schemas
[Madhavan et al. 2001], ontologies [Giunchiglia et al. 2009], and generic models
[Atzeni and Torlone 1995].
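As a rough illustration of this input/output behavior, the following minimal sketch treats each schema as a flat list of element names and reports a correspondence whenever two names are sufficiently similar. The function name match_schemas, the string-similarity heuristic, the 0.6 threshold, and the sample schemas are all assumptions made for this example; they are not taken from any of the tools cited above, which rely on far richer techniques.

```python
from difflib import SequenceMatcher


def match_schemas(source, target, threshold=0.6):
    """Return (source_element, target_element, score) correspondences.

    Hypothetical name-based matcher: two elements correspond when the
    similarity of their lowercased names reaches the threshold.
    """
    correspondences = []
    for s in source:
        for t in target:
            # ratio() returns a similarity in [0, 1] based on matching blocks.
            score = SequenceMatcher(None, s.lower(), t.lower()).ratio()
            if score >= threshold:
                correspondences.append((s, t, round(score, 2)))
    return correspondences


# Illustrative schemas (assumed for this sketch only).
source_schema = ["emp_name", "salary", "dept"]
target_schema = ["employee_name", "wage", "department"]
matches = match_schemas(source_schema, target_schema)
```

A purely lexical comparison like this one misses correspondences between synonyms such as salary and wage, which is one reason why matchers typically combine several kinds of evidence.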
A match is defined as a triple ⟨S_s, E_t, e⟩, where S_s is a set of elements from the