Discovery and Correctness of Schema Mapping Transformations - Schema Matching and Mapping


mapping formulas. As an example, the schema mapping shown on the right-hand side of Fig. 5.4 can be defined by means of nested s-t tgds. We omit the quantifiers for readability (variables on the right that do not appear on the left are existentially quantified); atomic variables are written in lowercase and set variables start with an uppercase letter:

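$$
\begin{aligned}
m'_1:\ & \textit{Public-Company}(sn, ss) \rightarrow [\, \textit{Company}(sn, ti, Grant) \,\wedge \\
& \quad [\, \textit{Public-Grant}(sg, sa, sn, sr, sm, sa) \wedge \textit{Contact}(sm, ph) \wedge \textit{Contact}(sa, ph2) \\
& \qquad \rightarrow \textit{Grant}(sg, sr, tf) \wedge \textit{FinancialData}(tf, sa, ph) \,]\,]
\end{aligned}
$$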

The second mapping is identical, except that the last variable in the atom FinancialData is ph2 instead of ph.

Intuitively, whenever a tgd $m_1$ writes into a target relation $R_1$ and a tgd $m_2$ writes into a relation $R_2$ nested into $R_1$, it is possible to "correlate" the two mappings by nesting $m_2$ into $m_1$. The correlation between inner and outer mappings can be observed in the variable sn, which appears both in Public-Company and in Public-Grant in the example above. The rewritten mapping reduces the number of redundant tuples in the target, since the same data is not mapped twice in the generated target instance. The same intuition applies if $R_2$ contains a foreign key pointing to relation $R_1$.
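The redundancy argument above can be illustrated with a small sketch. It is deliberately simplified and is not the chapter's algorithm: the relation layouts, the sample rows, and the function names `flat_mappings` and `nested_mapping` are all hypothetical, chosen only to show how two independent mappings that both write Company tuples compare with a single mapping in which the grant part is nested and correlated on `sn`.

```python
# Toy source instance (hypothetical data and attributes, not from the chapter).
public_companies = [("Acme", "NY"), ("Initech", "TX")]        # (sn, ss)
public_grants = [("g1", "Acme", 100), ("g2", "Acme", 250)]    # (sg, sn, amount)


def flat_mappings(companies, grants):
    """Two independent mappings, each writing its own Company tuples."""
    target = []
    # Mapping A: every public company yields a Company with no grants.
    for sn, _ss in companies:
        target.append({"name": sn, "grants": []})
    # Mapping B: every (company, grant) pair yields another Company tuple
    # that carries just that grant.
    for sn, _ss in companies:
        for sg, grant_sn, amount in grants:
            if grant_sn == sn:
                target.append({"name": sn, "grants": [(sg, amount)]})
    return target


def nested_mapping(companies, grants):
    """The grant mapping is nested in the company mapping, correlated on sn."""
    target = []
    for sn, _ss in companies:
        inner = [(sg, amount) for sg, grant_sn, amount in grants if grant_sn == sn]
        target.append({"name": sn, "grants": inner})
    return target


flat = flat_mappings(public_companies, public_grants)
nested = nested_mapping(public_companies, public_grants)
print(len(flat), len(nested))
```

In this toy instance the flat version emits "Acme" three times (once without grants and once per grant), whereas the nested version emits each company once with its grants grouped underneath.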

Nested mappings are correlated in a rewriting step based on a nestable property for a given pair of mappings. The property is verified with a syntactic check based on the structures of the schemas involved in the mappings and the correspondences between them. Once the property has been verified for all the mappings composing a scenario, the nesting algorithm constructs a DAG in which each node is a mapping, with edges to the other mappings for which it is nestable. The DAG represents all the possible ways in which mappings can be nested under other mappings. The algorithm identifies the root mappings of the DAG (mappings that are not nestable), traverses the DAG from each root mapping to identify a tree of mappings, and generates a nested mapping for each tree, rewriting the variables according to the structure.
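The DAG-based steps described above can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the published algorithm: the nestable check is taken as given (a list of hypothetical `(inner, outer)` pairs), a mapping reachable through more than one parent is simply placed under the first one visited, and the final variable-rewriting step is omitted. The function name `build_nesting_forest` is invented for this sketch.

```python
def build_nesting_forest(mappings, nestable_pairs):
    """Group mappings into trees according to a precomputed nestable relation.

    mappings:        list of mapping names
    nestable_pairs:  list of (inner, outer) pairs, meaning that `inner`
                     passed the nestable check with respect to `outer`
    Returns one tree per root, each tree shaped as {name: [subtrees]}.
    """
    children = {}        # outer mapping -> mappings nestable under it
    has_parent = set()   # mappings that are nestable under some other mapping
    for inner, outer in nestable_pairs:
        children.setdefault(outer, []).append(inner)
        has_parent.add(inner)

    # Root mappings are those that are not nestable under any other mapping.
    roots = [m for m in mappings if m not in has_parent]

    def tree(node, seen):
        # Depth-first traversal; `seen` ensures that each mapping is placed
        # only once per tree (an assumption of this sketch).
        seen.add(node)
        return {node: [tree(c, seen) for c in children.get(node, []) if c not in seen]}

    return [tree(root, set()) for root in roots]


forest = build_nesting_forest(
    ["m1", "m2", "m3", "m4"],
    [("m2", "m1"), ("m3", "m2")],
)
print(forest)
```

Each resulting tree corresponds to one candidate nested mapping: here `m3` is nested under `m2`, which is nested under the root `m1`, while `m4` stays a separate, unnested mapping.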

As nested mappings factor out common subexpressions, their use has clear benefits: (1) more efficient translation queries can be produced by reducing the number of passes over the source; and (2) the generated data have less redundancy, as the same data are not mapped repeatedly by s-t tgds sharing common parts.

Another attempt to reduce the redundancy generated by basic mappings has been proposed by Cabibbo [2009]. The work revises both the mapping generation and the query generation algorithms. In the mapping generation phase, the presence of nullable attributes is taken into account to introduce an extended notion of logical associations, a modified chase procedure to compute them, and novel pruning rules that are used together with the subsumption and implication rules from Popa et al. [2002]. The query generation algorithm is also revised to ensure that target key constraints are satisfied, or to reveal that such keys cannot be satisfied. Moreover, when there are key conflicts between groups of logical mappings with the same target relation, an algorithm tries to resolve those conflicts by rewriting the conflicting logical mappings into queries with negation. Such interaction between mapping and query generation
