generate plausible interpretations to produce a precise and faithful representation
of the transformation, i.e., the mappings. For instance, in the schema mapping scenario
of Fig. 5.4a, consider only the correspondence $v_1$. One possible mapping that
this correspondence alone describes is that for each Public-Company in the source
instance, there should be a Company in the target instance with the same name.
Based on similar reasoning for correspondence $v_2$, for every Public-Grant with
identifier gid in the source instance, it is expected that there should be a Company
tuple in the target instance with that grant identifier as its fid attribute. By noticing that a
Public-Grant is related to a Public-Company through the foreign key on attribute
company, one can easily realize that a more natural interpretation of these two
correspondences is that every public grant identifier found in a tuple of the target
table Company should have, as its associated company name, the name of the
public company with which that grant is associated in the source. Yet, it is
not clear whether public companies with no associated grants should appear in the
target table Company with a null fid attribute, or should not appear at all. Furthermore,
note that the target schema relation has an attribute phone that is populated
from the attribute of the same name in the source. This value should not be random but
somehow related to the company and the grant. However, note that the Contact
table in which the phone is located is related to the grant information through two
different join paths, i.e., one on the manager and one on the assistant. The information
provided by the correspondence on the phone is not enough to specify whether
the target should be populated with the phone of the manager or the phone of the
assistant.
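The two ambiguities above can be made concrete with a small sketch. Since Fig. 5.4a is not reproduced here, the attribute names manager, assistant, and cid, as well as the sample tuples, are assumptions chosen purely for illustration; only name, gid, company, fid, and phone come from the text. The hypothetical function populate_company produces the target Company tuples under each possible interpretation.

```python
# Toy source instance. The attributes "manager", "assistant" and "cid",
# and all sample values, are assumptions made for illustration only.
public_company = [{"name": "Acme"}, {"name": "Globex"}]  # Globex has no grants
public_grant = [
    {"gid": "g1", "company": "Acme", "manager": "c1", "assistant": "c2"},
]
contact = [
    {"cid": "c1", "phone": "555-0100"},
    {"cid": "c2", "phone": "555-0199"},
]


def populate_company(phone_via, keep_grantless=False):
    """Build target Company tuples under one interpretation.

    phone_via      -- which join path reaches Contact: "manager" or "assistant"
    keep_grantless -- whether companies without grants appear with null fid
    """
    phones = {c["cid"]: c["phone"] for c in contact}
    rows = []
    for company in public_company:
        # Follow the foreign key from Public-Grant to Public-Company.
        grants = [g for g in public_grant if g["company"] == company["name"]]
        for g in grants:
            rows.append({
                "name": company["name"],
                "fid": g["gid"],
                "phone": phones[g[phone_via]],
            })
        if not grants and keep_grantless:
            rows.append({"name": company["name"], "fid": None, "phone": None})
    return rows


via_manager = populate_company("manager")
via_assistant = populate_company("assistant")
with_grantless = populate_company("manager", keep_grantless=True)
```

All of these results are consistent with the correspondences, yet they differ: via_manager and via_assistant disagree on the phone value, and with_grantless contains an extra tuple for the company that has no grants. The correspondences alone do not say which one the designer intended.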
The challenging task of interpreting such ambiguous correspondences gave rise
to the schema mapping problem as introduced in Sect. 2.
3.3 Schema Mapping as Query Discovery
One of the first mapping tools to systematically study the schema mapping problem
was Clio [Miller et al. 2000], a tool developed by IBM. The initial algorithm of the
tool considers each target schema relation independently. For each relation $R_i$, it
creates a set $V_{R_i}$ of all the correspondences that are on a target schema element
belonging to the relation $R_i$. Naturally, all these sets are mutually disjoint. For each
such set, a query $Q^{V_{R_i}}$ is constructed to populate the relation $R_i$. This
query is constructed as follows. The set $V_{R_i}$ of correspondences is further divided
into maximal subsets such that each maximal subset $M_k^{V_{R_i}}$ contains at most
one correspondence for each attribute of the respective target schema relation. For
all the source schema elements used by the correspondences in each such subset,
the possible join paths connecting them are discovered and combined to form a
union of join queries. These queries are then combined through an outer
union operator to form the query $Q^{V_{R_i}}$.
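The first two steps of this procedure, grouping correspondences by target relation and splitting each group into maximal subsets, can be sketched as follows. This is a minimal illustration under stated assumptions, not Clio's actual implementation: the Correspondence type and both function names are hypothetical, the label v3 for the phone correspondence is assumed, and v4 is an invented second correspondence onto Company.name, added only so that the split yields more than one subset. Join-path discovery and the final outer union are left out.

```python
from collections import defaultdict
from itertools import product
from typing import NamedTuple


class Correspondence(NamedTuple):
    """Hypothetical representation: elements written as 'Relation.attribute'."""
    name: str
    source: str
    target: str


def group_by_target_relation(correspondences):
    """Build one set of correspondences per target relation."""
    groups = defaultdict(list)
    for c in correspondences:
        relation = c.target.split(".")[0]
        groups[relation].append(c)
    return dict(groups)


def maximal_subsets(group):
    """Split a group into maximal subsets with at most one
    correspondence per target attribute.

    A subset is maximal exactly when it picks one correspondence for
    every covered attribute, so the subsets are the Cartesian product
    of the per-attribute alternatives.
    """
    by_attribute = defaultdict(list)
    for c in group:
        by_attribute[c.target].append(c)
    return [list(choice) for choice in product(*by_attribute.values())]


correspondences = [
    Correspondence("v1", "Public-Company.name", "Company.name"),
    Correspondence("v2", "Public-Grant.gid", "Company.fid"),
    Correspondence("v3", "Contact.phone", "Company.phone"),        # label assumed
    Correspondence("v4", "Private-Company.name", "Company.name"),  # invented
]

groups = group_by_target_relation(correspondences)
subsets = maximal_subsets(groups["Company"])
subset_names = [[c.name for c in subset] for subset in subsets]
```

With these inputs, subset_names contains two subsets, one using v1 and one using v4 for Company.name, each also containing v2 and v3. In the procedure described above, each such subset would then give rise to a union of join queries over the source relations it mentions.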