CHAPTER 1
Introduction
Doubt is not a pleasant condition, but certainty is absurd.
- Voltaire
Schema matching is the task of providing correspondences between concepts describing
the meaning of data in various heterogeneous, distributed data sources (e.g., attributes in database
schemata, tags in XML DTDs, fields in HTML forms, and input and output parameters in Web
services). Schema matching is one of the basic operations required by the process of data and schema
integration [Batini et al., 1986, Bernstein and Melnik, 2004, Lenzerini, 2002], and thus has great
effect on its outcomes, whether these involve targeted content delivery, view integration, database
integration, query rewriting over heterogeneous sources, duplicate data elimination, or automatic
streamlining of workflow activities that involve heterogeneous data sources. As such, schema matching
affects numerous modern applications from a wide variety of areas. It impacts business, where
company data sources continuously realign due to changing markets; and it affects the way business
and other information consumers seek information over the Web. It impacts the life sciences,
where scientific workflows cross system boundaries more often than not. Finally, it impacts the way
communities of knowledge are created and evolve.
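To make the notion of a correspondence concrete, the following is a minimal sketch of a purely name-based matching heuristic. It is an illustration under stated assumptions, not the method of any system cited in this chapter: the function names `name_similarity` and `match_schemas`, the 0.6 threshold, and both example schemata are hypothetical, and string similarity is computed with Python's standard `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Return a similarity score in [0, 1] for two attribute names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_schemas(source, target, threshold=0.6):
    """Pair each source attribute with its most similar target attribute.

    A pair is reported as a correspondence only if its score reaches the
    (hypothetical) threshold. Returns (source_attr, target_attr, score) tuples.
    """
    if not target:
        return []
    correspondences = []
    for s in source:
        # Pick the target attribute whose name is closest to the source name.
        best = max(target, key=lambda t: name_similarity(s, t))
        score = name_similarity(s, best)
        if score >= threshold:
            correspondences.append((s, best, round(score, 2)))
    return correspondences


# Hypothetical schemata, invented for illustration only.
source_schema = ["cust_name", "phone", "zip"]
target_schema = ["customer_name", "phone_number", "postal_code"]

if __name__ == "__main__":
    for src, tgt, score in match_schemas(source_schema, target_schema):
        print(f"{src} -> {tgt} (score {score})")
```

A heuristic of this kind is cheap to compute but uncertain by nature: a pair such as `zip` and `postal_code` shares almost no characters, so name similarity alone is unlikely to propose it even though the two attributes describe the same concept.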
Schema matching research has been going on for more than 25 years now, first as part
of schema integration and then as a standalone research field (see surveys [Batini et al., 1986,
Rahm and Bernstein, 2001, Sheth and Larson, 1990, Shvaiko and Euzenat, 2005] and online lists,
e.g., OntologyMatching¹ and Ziegler²). Over the years, a significant body of work has been devoted
to the identification of schema matchers, that is, heuristics for schema matching. The main objective of
schema matchers is to provide correspondences that will be effective from the user's point of view, yet
computationally efficient (or at least not disastrously expensive). Examples of algorithmic tools used
for schema matching include COMA [Do and Rahm, 2002], Cupid [Madhavan et al., 2001], OntoBuilder
[Gal et al., 2005b], Autoplex [Berlin and Motro, 2001], Similarity Flooding [Melnik et al.,
2003], Clio [Miller et al., 2001], Glue [Doan et al., 2002], and others [Bergamaschi et al., 2001,
Castano et al., 2001, Saleem et al., 2007]. These have come out of a variety of different research
communities, including database management, information retrieval, the information sciences, data
semantics and the semantic Web, and others. Research papers from different communities have
1 http://www.ontologymatching.org/
2 http://www.ifi.unizh.ch/~pziegler/IntegrationProjects.html
 