$\ldots = in^{OP} \ldots h_{q_E} \cap Th_{q_E}^{OP} \cap Tf^{OP} \ldots (Tf_{in})^{OP}$.

Consequently, its information flux is $T_e(f_{in}; \ldots) \cap \ldots$ (from the fact that $in \ldots h_q = in^{OP} \ldots h_{q_E} = Th^{OP} \ldots$, and from Corollary 15, $\widetilde{h_{q_E}} \subseteq \widetilde{f_{in}} = \widetilde{Tf^{OP}}$) $= \widetilde{h_{q_E}} = \beta_1$, and hence $T_e(f_{in}; (Tf_{in})^{OP})$ (from Example 26) $= \beta_1$.
Let us now show how the query-rewriting algorithms can be represented in the DB category. First, define the set of all queries over the canonical model of the global schema, which is represented in DB by the set of query-morphisms

$S_{can(\mathcal{I},\mathcal{D})} = \{ h_q \mid h_q = \{f, q_\perp\} \in DB(can(\mathcal{I},\mathcal{D}), Tcan(\mathcal{I},\mathcal{D})) \text{ such that } h_q \in T_w(can(\mathcal{I},\mathcal{D})) \}$.

Then the query-expansion algorithm can be represented by the function

$Exp_Q : S_{can(\mathcal{I},\mathcal{D})} \to DB(ret(\mathcal{I},\mathcal{D}), Tret(\mathcal{I},\mathcal{D}))$

such that for each query-morphism over the global database, $h_q = \{f, q_\perp\} \in S_{can(\mathcal{I},\mathcal{D})}$, we obtain an equivalent query-morphism over the retrieved database, $Exp_Q(h_q) = \{f_E, q_\perp\} \in DB(ret(\mathcal{I},\mathcal{D}), Tret(\mathcal{I},\mathcal{D}))$, with $im(f) = im(f_E)$ and hence $\widetilde{h_q} = \widetilde{Exp_Q(h_q)} = \widetilde{h_{q_E}}$.
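The requirement that a query over the canonical database and its expansion over the retrieved database have the same image can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the relation names `r`, `s`, `t`, the single inclusion constraint, and the helper functions `chase_inclusions`, `expand_query` and `evaluate` are hypothetical, and the rewriting rule is a deliberately simple stand-in, not the query-expansion algorithm of the text.

```python
# Toy sketch (all names are hypothetical): compare answers over a canonical
# database with answers of an expanded query over the retrieved database.

def chase_inclusions(db, inclusions):
    """Build a toy canonical database by closing db under constraints src ⊆ dst."""
    can = {name: set(rows) for name, rows in db.items()}
    changed = True
    while changed:
        changed = False
        for src, dst in inclusions:
            missing = can[src] - can[dst]
            if missing:
                can[dst] |= missing
                changed = True
    return can

def expand_query(relation, inclusions):
    """Stand-in for Exp_Q: a scan of `relation` becomes a union of scans over
    every relation that the constraints force to be contained in it."""
    expanded = {relation}
    changed = True
    while changed:
        changed = False
        for src, dst in inclusions:
            if dst in expanded and src not in expanded:
                expanded.add(src)
                changed = True
    return expanded

def evaluate(relations, db):
    """Answer a union-of-scans query: the union of the named relations."""
    answers = set()
    for name in relations:
        answers |= db[name]
    return answers

ret_db = {"r": {(1,), (2,)}, "s": {(3,)}, "t": {(4,)}}   # retrieved database
inclusions = [("r", "s")]                                 # constraint r ⊆ s
can_db = chase_inclusions(ret_db, inclusions)             # toy canonical database

answers_over_can = evaluate({"s"}, can_db)
answers_over_ret = evaluate(expand_query("s", inclusions), ret_db)
```

In this toy setting both evaluations return the same set of tuples, which mirrors the condition $im(f) = im(f_E)$: the original query is answered over the materialized canonical database, while the expanded query reaches the same answers from the retrieved database alone.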
4.2.4 Fixpoint Operator for Finite Canonical Solution
The database instance $can(\mathcal{I}, \mathcal{D})$, which is a model of the global schema $\mathcal{G} = (S_{\mathcal{G}}, \Sigma_{\mathcal{G}})$, can be an infinite database (see Example 27 below) and hence impossible to materialize in real applications. Thus, in this subsection, we introduce a new approach to the canonical instance-database, closer to the data exchange approach [4]. It is not restricted to the existence of query-rewriting algorithms and hence can be used to define a Coherent Closed World Assumption for data integration systems also in the absence of query-rewriting algorithms [12].

The construction of the finite canonical instance-database for $\mathcal{G}$, which does not satisfy all the integrity constraints of the logical theory for a data integration system $\mathcal{I} = (\mathcal{G}, \mathcal{S}, \mathcal{M})$, is similar to the construction of the canonical model $can(\mathcal{I}, \mathcal{D})$ described in [2]. The difference lies in the fact that in the construction of this revisited canonical database for a global schema (which is not a model of the global schema), denoted by $can_F(\mathcal{I}, \mathcal{D})$, the fresh marked null values (from the set $SK = \{\omega_0, \omega_1, \ldots\}$ of Skolem constants) are used instead of terms involving Skolem functions in the SOtgds obtained from the foreign key constraints in $\Sigma_{\mathcal{G}}^{tgd}$, following the idea of construction of the restricted chase of a database described in [8].

Thus, for a given universe $\mathcal{U} = dom \cup SK$, we permit the use of these Skolem constants for primary keys as well, unlike standard relational databases, where there is only one NULL value, which cannot be used for primary key attributes of a relation. Here it is possible precisely because we have the marked null values $\omega_i$, which can also be used for the primary-key attributes because $\omega_i \neq \omega_j$ for all $i \neq j$.
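The role of marked nulls can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the construction of $can_F(\mathcal{I}, \mathcal{D})$ itself: the relations `emp`, `dept` and `manager`, the class `MarkedNull`, and the function `restricted_chase` are hypothetical names; every referenced relation is assumed to have its primary key in column 0; and the foreign keys are assumed to be acyclic, so the loop terminates.

```python
from dataclasses import dataclass
from itertools import count

@dataclass(frozen=True)
class MarkedNull:
    """A marked null ω_i: two marked nulls are equal only if their indices match."""
    index: int

    def __repr__(self):
        return f"ω{self.index}"

def restricted_chase(db, foreign_keys, arities):
    """Apply foreign-key constraints (src, col, dst) until nothing changes.

    A new dst tuple is added only when no tuple with the required key
    already exists (the "restricted" condition); its remaining attributes
    are filled with fresh marked nulls instead of Skolem terms.
    Assumes acyclic foreign keys, otherwise the loop may not terminate.
    """
    db = {name: set(rows) for name, rows in db.items()}
    fresh = count()
    changed = True
    while changed:
        changed = False
        for src, col, dst in foreign_keys:
            for row in sorted(db[src], key=repr):      # sorted copy: stable order
                key = row[col]
                if any(t[0] == key for t in db[dst]):
                    continue                           # witness exists, no new tuple
                nulls = tuple(MarkedNull(next(fresh))
                              for _ in range(arities[dst] - 1))
                db[dst].add((key,) + nulls)
                changed = True
    return db

def key_is_unique(rows):
    """Primary-key check on column 0: no two tuples share a key value."""
    return len({row[0] for row in rows}) == len(rows)

source = {"emp": {("e1", "d1"), ("e2", "d1")}, "dept": set(), "manager": set()}
arities = {"emp": 2, "dept": 2, "manager": 2}
# emp.dept_id references dept; dept.mgr_id references manager.
foreign_keys = [("emp", 1, "dept"), ("dept", 1, "manager")]

chased = restricted_chase(source, foreign_keys, arities)
```

In this run the null introduced for the unknown manager of `d1` reappears in the key column of `manager`. Because distinct marked nulls compare as different values, `key_is_unique` still behaves as an ordinary primary-key check, which a single undistinguished NULL would not allow.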