Discovery and Correctness of Schema Mapping Transformations - Schema Matching and Mapping


for the unspecified elements is the result of a Skolem function that takes as arguments the values of all the source schema elements. The annotations made on the set elements are used to create Skolem functions that drive the correct nesting. More details can be found in Fagin et al. [2009].

At this point, the final queries, or transformation scripts in general, can be constructed. First, the variable of every unspecified target schema element in the mapping is replaced by its Skolem function expression. In its simplest brute-force form, the final query is generated by first executing the query described on the left-hand side of the mapping tgd expression for every nesting level, i.e., for every set element of any nesting depth of the target. Then, the Skolem functions that have been computed for the set elements of the target are used to partition the result sets of these queries and place them nested under the right elements. The full details of this task can be found in Fagin et al. [2009].
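The brute-force strategy just described can be illustrated with a minimal Python sketch. It is not the algorithm of Fagin et al. [2009]: the flat source (company, grant, amount), the two-level target (organizations with a nested set of fundings), and the names skolem, SK_fundings, SK_fid, and exchange are all assumptions made for illustration. One query is run per nesting level, and the Skolem term of the nested set is then used to partition the inner result and attach each partition to its parent.

```python
from collections import defaultdict


def skolem(name, *args):
    """Build a deterministic Skolem term: equal name and arguments give an equal term."""
    return (name, args)


# Hypothetical flat source instance: one row per (company, grant, amount).
source_rows = [
    {"company": "Acme", "grant": "G1", "amount": 100},
    {"company": "Acme", "grant": "G2", "amount": 250},
    {"company": "Initech", "grant": "G3", "amount": 75},
]


def exchange(rows):
    # Level 1: evaluate the source query for the top-level set (organizations).
    # The nested set of each organization is identified by a Skolem term that
    # depends only on the company, so all grants of a company share one set.
    orgs = {}
    for r in rows:
        set_id = skolem("SK_fundings", r["company"])
        orgs[set_id] = {"org_name": r["company"], "fundings": []}

    # Level 2: evaluate the source query again for the nested set (fundings)
    # and partition its result by the Skolem term of the enclosing set.
    partitions = defaultdict(list)
    for r in rows:
        set_id = skolem("SK_fundings", r["company"])
        partitions[set_id].append({
            # Unspecified target element: a Skolem term over all source values.
            "funding_id": skolem("SK_fid", r["company"], r["grant"], r["amount"]),
            "grant": r["grant"],
            "amount": r["amount"],
        })

    # Nesting: place each partition under the element that owns that set.
    for set_id, members in partitions.items():
        orgs[set_id]["fundings"] = members
    return list(orgs.values())


target = exchange(source_rows)
```

In this sketch the argument list chosen for SK_fundings is what decides the grouping: using only the company groups all its grants together, whereas adding the grant to the arguments would yield one singleton set per source row.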

4 Second-Generation Mapping Systems

Inspired by the seminal papers about the first schema mapping system [Miller et al. 2000; Popa et al. 2002], a rich body of research in the following years has proposed algorithms and tools to improve the ease of use of mapping systems [An et al. 2007; Raffio et al. 2008; Cabibbo 2009; Mecca et al. 2009b] (see Sect. 7 and Chap. 9) and the quality of the solutions they produce. As shown experimentally in Fuxman et al. [2006] and Mecca et al. [2009a], different solutions for the same scenario may differ significantly in size, and for large source instances the amount of redundancy in the target produced by first-generation mapping systems may be very large, thus impairing the efficiency of the exchange and of the query answering process. Since the core is the smallest among the solutions that preserve the semantics of the exchange, it is considered a desirable requirement for a schema mapping system to generate executable scripts that materialize core solutions for a mapping scenario.
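To make the notion of redundancy concrete, the following Python sketch removes target tuples that are subsumed by more informative ones. It is a deliberately simplified illustration, not a core-computation algorithm: it assumes that every labeled null occurs in exactly one tuple, so that a tuple-by-tuple check is sufficient, whereas computing the core in general requires reasoning about homomorphisms over the whole instance. The class Null and the functions maps_into and drop_subsumed are hypothetical names introduced here.

```python
class Null:
    """A labeled null, i.e., a placeholder value invented during the exchange."""

    def __init__(self, label):
        self.label = label

    def __repr__(self):
        return f"N{self.label}"


def maps_into(t, u):
    """True if fixing constants and sending each null of t to one value turns t into u."""
    if len(t) != len(u):
        return False
    assignment = {}
    for a, b in zip(t, u):
        if isinstance(a, Null):
            if a in assignment and assignment[a] != b:
                return False  # the same null would need two different images
            assignment[a] = b
        elif a != b:
            return False  # constants must be preserved
    return True


def drop_subsumed(tuples):
    """Keep only tuples that do not map into another tuple of the same relation."""
    kept = []
    for i, t in enumerate(tuples):
        redundant = False
        for j, u in enumerate(tuples):
            if i == j or not maps_into(t, u):
                continue
            if maps_into(u, t) and i < j:
                continue  # equivalent tuples: keep the earlier one
            redundant = True
            break
        if not redundant:
            kept.append(t)
    return kept


# Hypothetical target relation (company, grant) with redundancy.
canonical = [
    ("Acme", Null(1)),     # subsumed by the next tuple
    ("Acme", "G1"),
    ("Initech", Null(2)),  # no more specific tuple exists, so it stays
]
smaller = drop_subsumed(canonical)
```

Here the first tuple carries no information beyond what ("Acme", "G1") already states, so dropping it shrinks the target without changing the answers to queries that return only constants.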

In this section, we present results related to this last issue and we show how novel algorithms for mapping generation and rewriting have progressively addressed the challenge of producing the best solution for data exchange.

4.1 Problems with Canonical Solutions

To see how translating data with mapping systems from a given source database may introduce a certain amount of redundancy into the target data, consider again the mapping scenario in Fig. 5.2 and its source instance in Fig. 5.1. To simplify the discussion, in the following we drop the target egd constraints, as they are not handled by most mapping systems during schema mapping generation. Based on the schemas and the correspondences in the scenario, a constraint-driven mapping
