3.4 Schema Mapping as Constraint Discovery
The algorithm that treats the schema mapping problem as query discovery failed
to handle two important cases. The first was schemas with complex nesting,
and the second was the management of unspecified attributes, i.e., attributes in the
target schema for which no correspondence specifies a value, yet the
target schema specification either does not permit a null value or, even if it does, its
use would lead to loss of information. Furthermore, it became clear that the schema
information in conjunction with the correspondences could not always lead to a full
specification of the target instance, but only to a constraint relationship between
the source and the target instance. Thus, a mapping stopped being
the actual transformation script and became an inter-schema constraint,
expressed as a tgd. This is a more natural view of the mapping problem: since
schemas are heterogeneous, it is natural to expect that not all the information
represented in the source can also exist in the target, and vice versa. Since a mapping
describes only the data that is to be exchanged between the schemas, the information
described by the mapping is a subset of the information described by the schemas.
Consider the example of Fig. 5.4b. The situation is more or less the same as the
one on its left, with the small difference that the target schema has all the grant
information grouped and nested within the company to which the grant belongs.
Furthermore, the amount of the grant is not stored within the grant but separately in
the FinancialData structure. Note that the Grant structure has an attribute fdid used
by no correspondence; thus it could remain null, if the target schema specification
permits it. If not, a random value could be generated to satisfy
this restriction. Unfortunately, either of the two actions would break the relationship
of the funding information with its amount, since the attribute fdid is actually the
foreign key that connects their respective structures.
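The effect on fdid can be illustrated with a minimal sketch. It assumes two hypothetical source tuples and flattened target structures Grant(gid, fdid) and FinancialData(fdid, amount); every name other than Grant, FinancialData, and fdid is illustrative and not taken from the figure. It compares three ways of filling fdid: null, independent random values, and a single invented value shared by both structures, which corresponds to using one existentially quantified variable in both target atoms of a tgd.

```python
import uuid

# Hypothetical source tuples; attribute names are illustrative only.
source_grants = [{"gid": "g1", "amount": 5000}, {"gid": "g2", "amount": 7000}]


def populate(make_ids):
    """Build target Grant and FinancialData rows.

    make_ids(gid) returns a pair: the fdid stored in Grant and the
    fdid stored in FinancialData for the same source tuple.
    """
    grants, financial = [], []
    for row in source_grants:
        grant_fdid, fin_fdid = make_ids(row["gid"])
        grants.append({"gid": row["gid"], "fdid": grant_fdid})
        financial.append({"fdid": fin_fdid, "amount": row["amount"]})
    return grants, financial


def amounts_by_grant(grants, financial):
    """Join Grant and FinancialData on fdid; a null (None) never joins."""
    return {
        g["gid"]: f["amount"]
        for g in grants
        for f in financial
        if g["fdid"] is not None and g["fdid"] == f["fdid"]
    }


def null_ids(gid):
    # Leave fdid unspecified on both sides.
    return (None, None)


def random_ids(gid):
    # Generate an unrelated random value on each side.
    return (uuid.uuid4().hex, uuid.uuid4().hex)


def shared_ids(gid):
    # Invent one value per source tuple and reuse it on both sides.
    value = f"fd({gid})"
    return (value, value)
```

Under these assumptions, joining the structures populated with `null_ids` or `random_ids` recovers no amount for any grant, whereas `shared_ids` recovers the amount of both g1 and g2.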
To discover the intended meaning of the correspondences and generate the mappings,
it is important to realize how the elements within a schema relate to each
other. This relationship will guide the combination of correspondences into groups
and the creation of the expected mappings. The idea for doing so comes from the
work on the universal relation [Maier et al. 1984]. The universal relation provides
a single-relation view of the whole database, so that the user does not have
to specify different tables and join paths. The construction of the universal relation
is based on the notion of logical access paths, or connections, as they were initially
called, which are groups of attributes connected either by being in the same table
or by following foreign key constraints [Maier et al. 1984].
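As a rough illustration of grouping attributes by table membership and foreign keys, the following sketch collects, for a starting table, every attribute reachable by following foreign key constraints. The schema, the `tables` and `foreign_keys` dictionaries, and the `connection` function are all hypothetical, and the traversal is a simplification rather than the construction used by Maier et al.

```python
# Hypothetical schema: table name -> attribute names.
tables = {
    "Company": ["cid", "cname"],
    "Grant": ["gid", "cid", "fdid"],
    "FinancialData": ["fdid", "amount"],
}

# Hypothetical foreign keys: (table, attribute) -> referenced table.
foreign_keys = {
    ("Grant", "cid"): "Company",
    ("Grant", "fdid"): "FinancialData",
}


def connection(start):
    """Return the attributes of `start` plus those of every table
    reachable from it by following foreign keys, as a sorted list."""
    seen, stack, attrs = set(), [start], []
    while stack:
        table = stack.pop()
        if table in seen:
            continue
        seen.add(table)
        # Attributes in the same table belong to the same group.
        attrs.extend(f"{table}.{a}" for a in tables[table])
        # Follow each foreign key that leaves this table.
        for (src, _attr), dst in foreign_keys.items():
            if src == table:
                stack.append(dst)
    return sorted(attrs)
```

With this schema, `connection("Company")` contains only the attributes of Company, while `connection("Grant")` also pulls in the attributes of Company and FinancialData, since Grant references both.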
A generalized notion of a connection is that of the association [Popa et al. 2002].
Intuitively, an association represents a set of elements in the same schema alongside
their relationships. An association is represented as a logical query whose head
consists of a relation with all the attributes mentioned in the body. For simplicity, the
head of the association is usually omitted. As an example, the following
logical query body:
    A(x, y, z), B(u, v, w), x = u