Database Reference
In-Depth Information
language of ground conjunctive queries : that is, conjunctions of atoms R ( a ),where R is a
relation, and a is a tuple of constants. Then Q (
D
) represents the maximum knowledge
we can extract from Q (
D
) with respect to this language. This is the familiar notion of
However, if we use the more expressive language of conjunctive queries , we can extract
D
). For instance, suppose Q (
D
) consists of two instances:
{
R ( a , a ) , R ( b , c )
}
and
{
R ( a , a ) , R ( d , c )
}
.
Then the conjunctive query R ( a , a )
) than the inter-
section of the instances above, which is just R ( a , a ). This certain knowledge about Q (
∧∃
xR ( x , c ) tells us more about Q (
D
D
)
would be traditionally presented as a naıve table: that is, a table with variables (nulls). The
naıve table corresponding to R ( a , a )
∧∃
xR ( x , c ) is simply
{
R ( a , a ) , R ( x , c )
}
. Notice that the
class of databases that satisfy R ( a , a )
): every valuation of x
in { R ( a , a ) , R ( x , c ) } gives such a database. Nevertheless, conjunctive queries are not able
to extract any more knowledge about Q (
∧∃
xR ( x , c ) is not equal to Q (
D
D
), and thus narrow down the set of described
databases. The formula R ( a , a )
∧∃
xR ( x , c ) is as close to a definition of Q (
D
) as possible
within the language of conjunctive queries. Let us formalize these intuitions.
We define the notion of a max-description of a set
of databases that, in a given lan-
guage, expresses the information we can infer with certainty from
D
D
queries are then a special case of max-descriptions, applied to sets
{
Q ( D )
}
as D ranges
over a collection of databases.
To explain this notion, assume that
L
is a logical formalism in which we express prop-
erties of databases from
D
(e.g., conjunctive queries, or ground conjunctive queries such
as R ( a , a )). A set
Φ
of formulae of
L
defines a set of databases, called models of
Φ
and
denoted Mod(
Φ
), consisting of all databases satisfying each formula in
Φ
:
Mod(
Φ
)=
{
D
|
D
|
=
ϕ
for every
ϕ Φ }
.
To describe
D
fully in
L
we would need its
L
-definition: a finite set
Φ
of
L
-formulae
such that
). This is not always achievable, so instead we settle for the next
best thing, which is an
D
= Mod(
Φ
L
-definition of the set of models of certain knowledge about
D
expressed in
L
.
This certain
L
-knowledge of the class
D
, called
L
-theory of
D
, is the set of all formu-
lae from
L
satisfied in all databases from
D
:
Th L (
D
)=
{ ϕ ∈L |
D
|
=
ϕ
for every D
∈D}
.
The most precise (finite) description of
D
we can express in
L
is a finite definition of
Mod(Th L (
D
)), i.e., a finite set
Φ
of formulae of
L
such that
Mod(
Φ
)=Mod(Th L (
D
)) .
of XML trees. Since our setting always
includes a schema, we will be making the following assumptions about such sets
Let us now apply this general definition to sets
T
T
:
Search WWH ::

Custom Search