between these two alternatives. It has a matching module similar to the one in Clio,
but the module is an integral part of the tool, allowing it to accept as input only the pair
of source and target schemas, if needed.
Since matching and mapping tools try to guess the intentions of the designer
based on the provided input, it is natural to assume that their output is not always
the one anticipated by the designer. As already mentioned, an a posteriori verifi-
cation is necessary. Nevertheless, there is a significant number of tools that allow
the active participation of the designer in the matching/mapping generation phase
to guide the whole process and arrive faster at the desired result. For example, once
some matchings/mappings have been generated, the designer can verify their correctness.
If she is unsatisfied with the result, she can go back and modify some
intermediate steps. For instance, she can tune the matcher, select a fraction of the
set of generated matches, enhance the matches by introducing new matches not
automatically generated by the matcher, tune the mapping generation process by
accepting only a fraction of the generated mappings, or even edit the mappings
directly. User participation is highly active in Tupelo [Fletcher and Wyss 2006], where
mapping generation is studied as a search problem driven by input data examples.
Domain knowledge, which is usually an input to the matcher, is also used as input to
the mapping discovery module. User feedback can be used to improve the effectiveness
of the discovered semantic functions, i.e., the matches, and of the structural
relationships, i.e., the mapping dependencies, which in turn can be entrusted to a data
mapping module for generating the final transformation query.
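
One round of the designer feedback described above (tuning the matcher, keeping only a fraction of the matches, and adding matches the matcher missed) can be sketched as follows. This is a minimal illustration, not the interface of any particular tool: the `Match` record, the `refine` function, the score threshold, and the schema element names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    """A candidate correspondence between a source and a target element (hypothetical)."""
    source: str
    target: str
    score: float  # matcher confidence in [0, 1]; 1.0 for designer-supplied matches


def refine(candidates, threshold, rejected=frozenset(), added=()):
    """Apply one round of designer feedback to a list of candidate matches.

    threshold -- tuning knob: drop matches scored below this value
    rejected  -- (source, target) pairs the designer explicitly discards
    added     -- (source, target) pairs the designer introduces by hand
    """
    kept = [
        m for m in candidates
        if m.score >= threshold and (m.source, m.target) not in rejected
    ]
    kept.extend(Match(s, t, 1.0) for s, t in added)
    return kept


# Candidate matches as an automatic matcher might propose them (made-up values).
candidates = [
    Match("emp.name", "staff.full_name", 0.82),
    Match("emp.dept", "staff.division", 0.55),
    Match("emp.name", "staff.division", 0.31),
]

refined = refine(
    candidates,
    threshold=0.5,
    rejected={("emp.dept", "staff.division")},
    added=[("emp.hired", "staff.start_date")],
)
pairs = [(m.source, m.target) for m in refined]
```

The refined set would then be handed to the mapping generation step, and the designer could repeat the round with a different threshold or a different selection if the resulting mappings are still not the intended ones.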
Many mapping tools are used as schema integration tools. Schema integration
is the process of merging multiple source schemas into one integrated schema, also
known as the global or mediated schema. The integrated schema serves as a uniform
interface for querying the data sources. Nowadays, construction of integrated schemas
has become a laborious task mainly due to the number, size, and complexity of the
schemas. On the other hand, decision makers need to understand, combine, and
exploit in a very short time all the information that is available to them before
acting [Smith et al. 2009]. This reality requires the rapid construction of large
prototypes and the flexible evolution of existing integrated schemas by users
with limited technical expertise. Matching and mapping tools facilitate that goal.
A mapping designer may be presented with a number of source schemas and an
empty target. Through a graphical interface, source schema elements can be selected
and “dropped” into the target. When the elements are dropped into the target, the
mappings specifying how the target elements are related to those in the sources
are automatically or semiautomatically generated. This functionality is graphically
depicted in Fig. 9.2. Note that schema integration involves additional tasks; however,
here we concentrate only on the part related to matching and mapping.
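
The drop-into-target interaction can be approximated in a few lines. The sketch below is only an assumption about how such a tool might record its state: the `IntegrationSession` class, its `drop` and `mappings` methods, and the `hr` and `payroll` schemas are hypothetical, and the generated mappings are plain element-to-element correspondences rather than the richer mapping dependencies a real tool would produce.

```python
class IntegrationSession:
    """Builds an integrated schema from elements dropped into an empty target (hypothetical)."""

    def __init__(self):
        # target attribute name -> list of (source schema, source attribute)
        self.target = {}

    def drop(self, source_schema, attribute, target_attribute=None):
        """Drop a source element into the target, optionally under a new name."""
        name = target_attribute or attribute
        self.target.setdefault(name, []).append((source_schema, attribute))

    def mappings(self):
        """Return one correspondence per dropped element, in drop order per target attribute."""
        return [
            f"{schema}.{attr} -> integrated.{name}"
            for name, sources in self.target.items()
            for schema, attr in sources
        ]


session = IntegrationSession()
session.drop("hr", "name", "full_name")
session.drop("payroll", "employee_name", "full_name")
session.drop("payroll", "salary")

generated = session.mappings()
```

Here two source elements are merged into a single target attribute, which is the point at which a real tool would have to decide, automatically or with the designer's help, how values from the two sources are combined.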
ETL systems are a special case of mapping tools. An ETL system is a tool
designed to perform large-scale extract-transform-load operations. The transformation
performed by an ETL system is typically described by a flowchart graph
in which each node represents a specific primitive transformation and the edges
between the nodes represent the flow of data produced as a result of one primitive
operator and fed as input to another. Figure 9.3 illustrates such a data flowchart. The