Databases Reference

In-Depth Information

1.16

See Chapter 7.

1.17

See Chapter 11.

1.18

See Chapter 13, also Appendix B.

CHAPTER 2

2.1
The Information Principle
is a fundamental principle that underpins the entire relational model. It can be

stated as follows:

Definition:
The Information Principle
states that the only kind of variable allowed in a relational database

is the relation variable or relvar. Equivalently, the entire information content of the database at any given

time is represented in one and only one way─namely, as values in attribute positions in tuples in relations.

Note that SQL tables (at least, SQL tables in the database) that involve left to right column ordering, or

contain duplicate rows or nulls, all violate
The Information Principle
(see the answer to the next exercise).

Interestingly, however, SQL tables with anonymous columns or columns with nonunique names apparently don't

violate the principle. The reason is that the principle as stated applies explicitly to relvars or relations
in the

database
. And while SQL tables in general can have anonymous columns or columns with nonunique names, such

tables can't be part of the database as such. This state of affairs suggests rather strongly that
The Information

Principle
could do with a little tightening up.

2.2 a. True. b. True. c. True. d. True. e. True. f. True. g. True. h. False. (However, it's “almost” true; there

are two small exceptions, both of which I'll simplify just slightly for present purposes. The first is that if relation
r

is of type
T
, then no attribute of
r
can itself be of type
T
. The second is that no relation in the database can have an

attribute of any pointer type.) i. True.
Subsidiary exercise:
Would any of these answers change if the original

statements are were framed in terms of SQL tables instead of relations and relvars? (
Answer:
Yes, they would all

change except for a. and h. In the case of h., moreover, the answer ought really to change too, from “False” to

“False, but even more so.” One reason for this state of affairs─not the only one─is that SQL has no proper notion of

table type, and SQL columns thus can't possibly be of such a type a fortiori.)

2.3 a. True. b. True. c. True.
Note:
Perhaps I should state for the record here that throughout this topic, in

accordance with normal practice, I take expressions of the form “
B
is a subset of
A
” to include the possibility that
B

and
A
might be equal. Thus, e.g., every heading is a subset of itself, and so is every body, and so is every tuple.

When I want to exclude such a possibility, I'll talk explicitly in terms of
proper
subsets. For example, the body of

our usual suppliers relation is certainly a subset of itself, but not a proper subset of itself (
no
set is a proper subset of

itself). What's more, the foregoing remarks apply equally to supersets, mutatis mutandis; for example, the body of

our usual suppliers relation is a superset of itself, but not a proper superset of itself.
More terminology:
A set is said

to
include
its subsets. Incidentally, don't confuse inclusion with
containment
─a set
includes
its subsets but
contains

its elements.

2.4 The reason the term isn't mentioned in the body of the chapter is that it's just a synonym for
type
. (Early

relational writings, my own included, tended to use it, but more recent ones use
type
instead, since it's shorter and

has a more extensive pedigree anyway, at least in the computing world.) Thus, a domain is a named, finite set of

values─all possible values of some specific kind: for example, all possible integers, or all possible character strings,

or all possible triangles, or all possible XML documents, or all possible relations with a specific heading (etc., etc.).