Discovery and Correctness of Schema Mapping Transformations - Schema Matching and Mapping

3.4 Schema Mapping as Constraint Discovery

The algorithm for managing the schema mapping problem as query discovery failed

to handle two important cases. The first was the case of complex nested schemas,

and the second was the management of unspecified attributes, i.e., attributes in the

target schema for which there is no correspondence to specify their value, yet the

target schema specification either does not permit a null value, or even if it does, its

use will lead to loss of information. Furthermore, it became clear that the schema

information in conjunction with the correspondences could not always lead to a full

specification of the target instance, but only into a constraint relationship between

the source and the target instance. Thus, the notion of a mapping stopped being

the actual transformation script and became instead an inter-schema constraint,

expressed as a tuple-generating dependency (tgd). This is a more natural view of the mapping problem since, with

schemas being heterogeneous, it is natural to expect that not all the information

represented in the source can also exist in the target, and vice versa. Since a mapping

describes only the data that is to be exchanged between the schemas, the information

described by the mapping is a subset of the information described by the schemas.
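For reference, a source-to-target tgd is commonly written in the general form below. The symbols are chosen here for illustration and do not come from the text: \phi_S stands for a conjunction of atoms over the source schema, \psi_T for a conjunction of atoms over the target schema, \mathbf{x} for the universally quantified variables, and \mathbf{y} for the existentially quantified ones.

```latex
\forall \mathbf{x}\, \bigl( \phi_S(\mathbf{x}) \rightarrow \exists \mathbf{y}\; \psi_T(\mathbf{x}, \mathbf{y}) \bigr)
```

Read this way, the formula constrains which target instances are acceptable for a given source instance, rather than prescribing a single target instance.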

Consider the example of Fig. 5.4b. The situation is more or less the same as the

one on its left, with the small difference that the target schema has all the grant

information grouped and nested within the company to which the grant belongs.

Furthermore, the amount of the grant is not stored within the grant but separately in

the FinancialData structure. Note that the Grant structure has an attribute fdid used

by no correspondence; thus, it could remain null if the target schema specification

permits it. If not, a random value could be generated to deal with

this restriction. Unfortunately, either of the two actions would break the relationship

of the funding information with its amount, since the attribute fdid is actually the

foreign key that connects their respective structures.
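The effect of the unmapped fdid attribute can be illustrated with a small sketch. Everything except the names Grant, FinancialData, and fdid is an assumption made for illustration: the source rows, the other attribute names, and the three value-generation strategies are hypothetical. The third strategy, which derives fdid from source values so that both structures receive the same value, is one possible remedy and is not stated in the passage above.

```python
from itertools import count

# Hypothetical source rows: (company, grant id, amount). Values are made up.
source_grants = [
    ("Acme", "g1", 1000),
    ("Acme", "g2", 2500),
]

def exchange(make_fdid):
    """Populate the target Grant and FinancialData structures.

    make_fdid(company, gid) supplies the value of the unmapped fdid attribute
    and is called once for each structure, as two independent rules would do.
    """
    grants, financial_data = [], []
    for company, gid, amount in source_grants:
        grants.append({"company": company, "gid": gid,
                       "fdid": make_fdid(company, gid)})
        financial_data.append({"fdid": make_fdid(company, gid),
                               "amount": amount})
    return grants, financial_data

def amount_of(gid, grants, financial_data):
    """Follow the fdid foreign key from a grant to its amount, if possible."""
    fdid = next(g["fdid"] for g in grants if g["gid"] == gid)
    if fdid is None:
        return None  # a null key cannot be followed
    matches = [f["amount"] for f in financial_data if f["fdid"] == fdid]
    return matches[0] if len(matches) == 1 else None

def null_fdid(company, gid):
    """Strategy 1: leave fdid null."""
    return None

_fresh = count()

def fresh_fdid(company, gid):
    """Strategy 2: an unrelated new value on every call (stands in for 'random')."""
    return next(_fresh)

def derived_fdid(company, gid):
    """Strategy 3 (assumption): derive fdid from the source values."""
    return f"fd({company},{gid})"
```

With `null_fdid` or `fresh_fdid`, `amount_of` cannot recover the amount of a grant, because the two structures never share a key value. With `derived_fdid`, the link between a grant and its amount survives the exchange.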

To discover the intended meaning of the correspondences and generate the mappings,

it is important to understand how the elements within a schema relate to each

other. This relationship will guide the combination of correspondences into groups

and the creation of the expected mappings. The idea for doing so comes from the

work on the universal relation [Maier et al. 1984]. The universal relation provides

a single-relation view of the whole database, such that the user does not have

to specify different tables and join paths. The construction of the universal relation

is based on the notion of logical access paths, or connections, as they were initially

called. These are groups of attributes connected either by being in the same table

or by following foreign key constraints [Maier et al. 1984].
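A simplified reading of this idea can be sketched as follows: starting from one table, collect its attributes and then the attributes of every table reachable by following foreign keys. The flat schema, its attribute lists, and the function name `connection` are hypothetical and only loosely inspired by the running example; this is not the construction of Maier et al.

```python
# Hypothetical flat schema: table name -> attribute names.
tables = {
    "Grant": ["gid", "company", "fdid"],
    "FinancialData": ["fdid", "amount"],
    "Contact": ["cid", "email"],
}

# Hypothetical foreign keys: (table, attribute) -> referenced table.
foreign_keys = {("Grant", "fdid"): "FinancialData"}

def connection(start):
    """Attributes connected to `start` by co-location or foreign keys."""
    seen, stack, attrs = set(), [start], []
    while stack:
        table = stack.pop()
        if table in seen:
            continue  # guard against cyclic foreign key chains
        seen.add(table)
        # Attributes in the same table are connected by definition.
        attrs.extend(f"{table}.{a}" for a in tables[table])
        # Follow every foreign key that leaves this table.
        stack.extend(ref for (t, _), ref in foreign_keys.items() if t == table)
    return attrs
```

Calling `connection("Grant")` gathers the attributes of both Grant and FinancialData, while `connection("Contact")` stays within Contact because no foreign key leaves it.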

A generalized notion of a connection is that of the association [Popa et al. 2002].

Intuitively, an association represents a set of elements in the same schema alongside

their relationships. An association is represented as a logical query whose head consists

of a relation with all the attributes mentioned in the body. For simplicity, the

head of the association is usually omitted. As an example, the following

logical query body:

    A(x, y, z), B(u, v, w), x = u
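To make the notation concrete, the query body A(x, y, z), B(u, v, w), x = u can be evaluated over small relation instances as a join. The sketch below assumes relations are lists of tuples; the sample rows and the function name `association` are hypothetical.

```python
# Hypothetical instances of relations A(x, y, z) and B(u, v, w).
A = [(1, "a", "b"), (2, "c", "d")]
B = [(1, "e", "f"), (3, "g", "h")]

def association(a_rows, b_rows):
    """Evaluate the body A(x, y, z), B(u, v, w), x = u.

    The omitted head holds every variable of the body, so each result
    tuple has the shape (x, y, z, u, v, w).
    """
    return [(x, y, z, u, v, w)
            for (x, y, z) in a_rows
            for (u, v, w) in b_rows
            if x == u]
```

Only pairs of tuples that agree on x and u contribute to the result, which is how the association records both the elements and the relationship between them.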
