Discovery and Correctness of Schema Mapping Transformations - Schema Matching and Mapping

tasks. This section provides an overview of the developments in mapping generation, from the earliest need for data transformations to the development of the first schema mapping tools in the form in which they are widely understood today.

Having defined the data exchange problem, this section describes how a mapping scenario can be constructed. The presented algorithm, which is the basis of the Clio [Popa et al. 2002] mapping scenario generation mechanism, has the additional advantage that it generates scenarios in which the mappings respect the target schema constraints. As a result, the target instance can be generated by considering only the mappings of the mapping scenario, without separately enforcing the target schema constraints. Mappings of this kind are more expressive than other formalisms such as simple correspondence lines [Rahm and Bernstein 2001] or morphisms [Melnik et al. 2005].

3.1 The First Data Translation Systems

Since the beginning of data integration, a major challenge has been the ability to translate data from one format to another. This problem of data translation has been studied for many years, in different variants and under different assumptions. One of the first systems was EXPRESS [Shu et al. 1977], developed by IBM. A series of similar but more advanced tools followed EXPRESS. The TXL language [Abu-Hamdeh et al. 1994], initially designed to describe syntactic software transformations, offered a richer set of operations and soon became popular in the data management community. It was based on transformation rules that were fired upon successful parsing of the input data. The problem became more challenging when data had to be transformed across different data models, a situation typically encountered in wrapper construction [Tork-Roth and Schwarz 1997]. MDM [Atzeni and Torlone 1997] was a system for this kind of transformation that was based on patterns [Atzeni and Torlone 1995].

Later work [Beeri and Milo 1999] proposed a tree-structured data model for describing schemas and showed that the model was expressive enough to represent relational and XML schemas, paving the way for the later introduction of tree-based transformations. A formal foundation for data translation was created, alongside a declarative framework for data translation [Abiteboul et al. 1997]. Based on this work, the TranScm system [Milo and Zohar 1998] used a library of transformation rules and pattern matching techniques to select the most applicable rules between two schemas, in an effort to automate the whole data translation task. Other transformation languages developed in parallel emphasized type checking [Cluet et al. 1998] or integrity constraint satisfaction [Davidson and Kosky 1997].
