$m_1$ copies company names and symbols in the NYSE source table to the Company
table in the target. In doing this, the mapping requires that some value,
represented by the existentially quantified variable $I$, is assigned to the id
attribute of the Company table. The Public source contains two relations with
company names and the grants that are assigned to them; this information is
copied to the target tables by mapping $m_2$. In this case, a value, again
denoted by the existentially quantified variable $I$, must be "invented" to
correlate a tuple in Grant with the corresponding tuple in Company. Finally,
mappings $m_3$ and $m_4$ copy data in the NSF source tables to the
corresponding target tables; note that in this case we do not need to invent
any values.
The target tgd encodes the foreign key on the target. The target egd simply
states that symbol is a key for Company.
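As a hedged illustration of the informal description above, mapping $m_1$ and the key constraint on Company might be written as the formulas below. The attribute orders NYSE(name, symbol) and Company(id, name, symbol) are assumptions made for this sketch; they are not taken from the scenario itself.

```latex
% m_1: every NYSE company is copied to Company, with an invented id I
\[
  \forall n, s \, \bigl( \mathit{NYSE}(n, s) \rightarrow
      \exists I \, \mathit{Company}(I, n, s) \bigr)
\]
% target egd: symbol is a key for Company
\[
  \forall i, n, s, i', n' \, \bigl( \mathit{Company}(i, n, s) \wedge
      \mathit{Company}(i', n', s) \rightarrow (i = i') \wedge (n = n') \bigr)
\]
```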
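To formalize, given two schemas, $\mathbf{S}$ and $\mathbf{T}$, an embedded
dependency [Beeri and Vardi 1984] is a first-order formula of the form
$\forall \overline{x}\,(\varphi(\overline{x}) \rightarrow \exists \overline{y}\,(\psi(\overline{x}, \overline{y})))$,
where $\overline{x}$ and $\overline{y}$ are vectors of variables,
$\varphi(\overline{x})$ is a conjunction of atomic formulas such that all
variables in $\overline{x}$ appear in it, and $\psi(\overline{x}, \overline{y})$
is a conjunction of atomic formulas. $\varphi(\overline{x})$ and
$\psi(\overline{x}, \overline{y})$ may contain equations of the form
$v_i = v_j$, where $v_i$ and $v_j$ are variables.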
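An embedded dependency is a tuple-generating dependency (tgd) if
$\varphi(\overline{x})$ and $\psi(\overline{x}, \overline{y})$ only contain
relational atoms. It is an equality-generating dependency (egd) if
$\psi(\overline{x}, \overline{y})$ contains only equations. A tgd is called an
s-t tgd if $\varphi(\overline{x})$ is a formula over $\mathbf{S}$ and
$\psi(\overline{x}, \overline{y})$ is a formula over $\mathbf{T}$. It is a
target tgd if both $\varphi(\overline{x})$ and $\psi(\overline{x}, \overline{y})$
are formulas over $\mathbf{T}$.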
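The definitions above can be sketched as a small classifier. This is only an illustrative encoding, not part of the original formalism: the `Dependency` class, the `"="` marker for equations, and the sample relation names and attribute orders are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Set, Tuple

# Hypothetical encoding: an atom is (name, variables). A relational atom uses
# a relation name; an equation v_i = v_j is written ("=", ("v_i", "v_j")).
Atom = Tuple[str, Tuple[str, ...]]


@dataclass(frozen=True)
class Dependency:
    premise: Tuple[Atom, ...]     # phi(x)
    conclusion: Tuple[Atom, ...]  # psi(x, y)


def is_equation(atom: Atom) -> bool:
    return atom[0] == "="


def classify(dep: Dependency, source: Set[str], target: Set[str]) -> str:
    """Classify a dependency following the definitions in the text."""
    # egd: the conclusion contains only equations.
    if dep.conclusion and all(is_equation(a) for a in dep.conclusion):
        return "egd"
    atoms = dep.premise + dep.conclusion
    # tgd: premise and conclusion contain only relational atoms.
    if all(not is_equation(a) for a in atoms):
        premise_rels = {name for name, _ in dep.premise}
        conclusion_rels = {name for name, _ in dep.conclusion}
        if premise_rels <= source and conclusion_rels <= target:
            return "s-t tgd"
        if premise_rels <= target and conclusion_rels <= target:
            return "target tgd"
        return "tgd"
    return "embedded dependency"


# Assumed schemas and attribute orders, for illustration only.
SOURCE = {"NYSE"}
TARGET = {"Company", "Grant"}

m1 = Dependency((("NYSE", ("n", "s")),), (("Company", ("I", "n", "s")),))
fk = Dependency((("Grant", ("a", "c")),), (("Company", ("c", "N", "S")),))
key = Dependency(
    (("Company", ("i", "n", "s")), ("Company", ("i2", "n2", "s"))),
    (("=", ("i", "i2")), ("=", ("n", "n2"))),
)

for dep in (m1, fk, key):
    print(classify(dep, SOURCE, TARGET))
```

Relation names are compared against the two schemas as plain sets, which relies on the schemas being disjoint.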
A mapping scenario (also called a data exchange scenario or a schema mapping)
is a quadruple $\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$,
where $\mathbf{S}$ is a source schema, $\mathbf{T}$ is a target schema,
$\Sigma_{st}$ is a set of s-t tgds, and $\Sigma_t$ is a set of target
dependencies that may contain tgds and egds. If the set of target dependencies
$\Sigma_t$ is empty, we will use the notation
$(\mathbf{S}, \mathbf{T}, \Sigma_{st})$.
Solutions. We can now introduce the notion of a solution for a mapping
scenario. To do this, given two disjoint schemas, $\mathbf{S}$ and
$\mathbf{T}$, we shall denote by $\langle \mathbf{S}, \mathbf{T} \rangle$ the
schema $\{S_1 \ldots S_n, T_1 \ldots T_m\}$. If $I$ is an instance of
$\mathbf{S}$ and $J$ is an instance of $\mathbf{T}$, then the pair
$\langle I, J \rangle$ is an instance of
$\langle \mathbf{S}, \mathbf{T} \rangle$. A target instance $J$ is a solution
of $\mathcal{M}$ and a source instance $I$ (denoted
$J \in \mathrm{Sol}(\mathcal{M}, I)$) iff
$\langle I, J \rangle \models \Sigma_{st} \cup \Sigma_t$, i.e., $I$ and $J$
together satisfy the dependencies.
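The satisfaction condition can be checked mechanically on small instances. The following is a minimal sketch, assuming instances are dictionaries from relation names to sets of tuples and that every value (constant or invented) is an ordinary Python value. The function names, attribute orders, and toy data are hypothetical and are not taken from the example scenario.

```python
def matches(atoms, instance, binding=None):
    """Yield every variable assignment that maps all atoms to facts of instance."""
    binding = dict(binding or {})
    if not atoms:
        yield binding
        return
    (relation, variables), rest = atoms[0], atoms[1:]
    for fact in instance.get(relation, ()):
        if len(fact) != len(variables):
            continue
        extended = dict(binding)
        # A variable that is already bound must agree with the fact's value.
        if all(extended.setdefault(v, c) == c for v, c in zip(variables, fact)):
            yield from matches(rest, instance, extended)


def satisfies_tgd(instance, premise, conclusion):
    """Each match of the premise must extend to a match of the conclusion."""
    return all(
        any(True for _ in matches(conclusion, instance, b))
        for b in matches(premise, instance)
    )


def satisfies_egd(instance, premise, equalities):
    """Each match of the premise must make every listed pair of variables equal."""
    return all(
        all(b[left] == b[right] for left, right in equalities)
        for b in matches(premise, instance)
    )


# Toy data, invented for this sketch. "N1" and "N2" stand for invented ids.
I = {"NYSE": {("Acme", "ACM")}}
J_one = {"Company": {("N1", "Acme", "ACM")}}
J_two = {"Company": {("N1", "Acme", "ACM"), ("N2", "Acme", "ACM")}}

# An s-t tgd and a key egd in the assumed encoding.
st_premise = [("NYSE", ("n", "s"))]
st_conclusion = [("Company", ("I", "n", "s"))]
key_premise = [("Company", ("i", "n", "s")), ("Company", ("i2", "n2", "s"))]
key_equalities = [("i", "i2"), ("n", "n2")]


def is_solution(source, target):
    pair = {**source, **target}  # <I, J>; the schemas are disjoint
    return (satisfies_tgd(pair, st_premise, st_conclusion)
            and satisfies_egd(pair, key_premise, key_equalities))


print(is_solution(I, J_one), is_solution(I, J_two))
```

In this toy data, `J_two` satisfies the s-t tgd but assigns two ids to the same symbol, so it fails the key egd and is not a solution.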
Given a mapping scenario
$\mathcal{M} = (\mathbf{S}, \mathbf{T}, \Sigma_{st}, \Sigma_t)$, with s-t and
target dependencies, we find it useful to define a notion of a pre-solution for
$\mathcal{M}$ and a source instance $I$ as a solution over $I$ for scenario
$\mathcal{M}_{st} = (\mathbf{S}, \mathbf{T}, \Sigma_{st})$, obtained from
$\mathcal{M}$ by removing target constraints. In essence, a pre-solution is a
solution for the s-t tgds only, and it does not necessarily enforce the target
constraints.
Figure 5.3 shows several solutions for our example scenario on the source
instance in Fig. 5.1. In particular, solution (a) is a pre-solution, since it
satisfies the s-t tgds but it does not comply with the key constraints and
therefore it does not satisfy the egds. Solution (b) is a solution for both the
s-t tgds and the egds. We want, however, to note that a given scenario may have
multiple solutions on a given source instance. This is a consequence of the
fact that each tgd only states an