Databases Reference
In-Depth Information
2
Preliminaries
A
schema
R
is a finite sequence hR
1
;:::;R
k
i of relation symbols, where each R
i
has a fixed arity. An
instance
I over
R
is a sequence .R
1
;:::;R
k
/, where each R
i
is a finite relation of the same arity as R
i
. We shall often use R
i
to denote both the
relation symbol and the relation R
i
that instantiates it. We assume that we have a
countably infinite set
Const
of
constants
and a countably infinite set
Var
of
labeled
nulls
that is disjoint from
Const
.A
fact
of an instance I (over
R
) is an expression
R
i
.v
1
;:::;v
m
/ (or simply R
i
.v
1
;:::;v
m
/), where R
i
is a relation symbol of
R
are constants or labeled nulls such that .v
1
;:::;v
m
/ 2 R
i
and v
1
;:::;v
m
.The
expression .v
1
;:::;v
m
/ is also sometimes referred to as a
tuple
of R
i
. An instance
is often identified with its set of facts.
A
ground
instance over some schema is an instance such that all values occurring
in its relations are constants. In general, however, instances over a schema may have
individual values from
Const
[
Var
; thus, some of the values in the instances may
be nulls representing unknown information. Such (non-ground) instances naturally
arise in data integration, data exchange and also schema evolution. We will see
examples of instances with nulls all throughout this paper.
Next, we define the concepts of
homomorphism
and
homomorphic equivalence
,
which we use frequently throughout this paper. Let I
1
and I
2
be instances over a
schema
R
. A function h from
Const
[
Var
to
Const
[
Var
is a
homomorphism
from
I
1
to I
2
if for every c in
Const
,wehavethath.c/ D c, and for every relation symbol
R in
R
and every tuple .a
1
;:::;a
n
/ 2 R
I
1
,wehavethat.h.a
1
/;:::;h.a
n
// 2
R
I
2
. We use the notation I
1
! I
2
to denote that there is a homomorphism from I
1
to I
2
. We say that I
1
is
homomorphically equivalent
to I
2
if I
1
! I
2
and I
2
! I
1
,
and we write this as I
1
$ I
2
.
Schema mappings:
A
schema mapping
is a triple
M
D .
S
;
T
;˙/,where
S
is a
source schema,
T
is a target schema, and ˙ is a set of constraints (typically, formu-
las in some logic) that describe the relationship between
S
and
T
. We say that
M
is syntactically
specified by
,or,
expressed by
˙.Furthermore,
M
is semantically
identified with the binary relation:
Inst.
M
/ Df.I;J/ j I is an
S
-instance, J is a
T
-instance, .I;J/ ˆ ˙g:
We will use the notation .I;J/ 2
M
to denote that the ordered pair .I;J/ satisfies
the constraints of
; furthermore, we will sometimes define schema mappings by
simply defining the set of ordered pairs .I;J/ that constitute
M
M
(instead of giving
a set of constraints that specify
M
). If .I;J/ 2
M
, we say that J is a
solution
of I
(with respect to
).
In general, the constraints in ˙ are formulas in some logical formalism. In
this chapter, we will focus on schema mappings specified by source-to-target
tuple-generating dependencies.
An
atom
is an expression of the form R.x
1
;:::;x
n
/,whereR is a relation sym-
bol and x
1
;:::;x
n
M
are variables that are not necessarily distinct. A source-to-target