We Need More Science - Database Design and Relational Theory

Databases Reference

In-Depth Information

At the logical level, then, it's tempting just to say that redundancy is always bad. But of course this

statement is much too simplistic, owing to the availability of the view mechanism if nothing else. Let me digress for

a moment to elaborate on this latter point. It's well known, but worth stating explicitly nevertheless, that views (like

normalization, though for very different reasons) serve two rather different purposes:

1.

The user who actually defines view V is, obviously, aware of the expression X in terms of which V is defined.

That user can use the name V wherever the expression X is intended, but such uses are basically just

shorthand (like the use of macros in a programming language).

2.

By contrast, a user who's merely informed that view V exists and is available for use is supposed not to be

aware of the expression X ; to that user, in fact, V is supposed to look and feel just like a base relvar. 3

As an example of Case 1, suppose the user perceives the database as containing two relvars R1 and R2 and

goes on to define their join as a view; clearly, then, that view is redundant so far as that user is concerned, and it

could be dropped without any loss of information. For definiteness, therefore, I'm going to assume from this point

forward (barring explicit statements to the contrary) that no relvar in the database is defined in terms of any others,

so that at least this particular kind of redundancy isn't usually present. With this possibility ruled out, then, it's

tempting to set a stake in the ground and say again that redundancy at the logical level is always undesirable. In

order to adopt such a position, however, we need to be able to say what we mean by redundancy─for otherwise the

position can't possibly make sense. And even if we can come up with a good definition of the term, is the position

(that redundancy at the logical level is always bad, that is) really tenable? Is it possible to eliminate all redundancy?

Is it even desirable?

These are questions of considerable pragmatic importance, of course. Indeed, I think it's noteworthy that

Codd called his very first (1969) paper on the relational model “Derivability, Redundancy , and Consistency of

Relations Stored in Large Data Banks” (boldface added; see Appendix C). And his second (1970) paper, “A

Relational Model of Data for Large Shared Data Banks” (again, see Appendix C)─this is the one that's usually

regarded as the seminal paper in the field, though that characterization is a little unfair to its 1969 predecessor─was

in two parts of almost equal length, the second of which was called “Redundancy and Consistency” (the first was

called “Relational Model and Normal Form”). Codd thus clearly regarded his thoughts on redundancy as a major

part of the contribution of his relational work: rightly so, in my opinion, since he did at least provide us with a

framework in which we could begin to address the issue precisely and systematically.

Now, I showed in the previous chapter that one “definition” of redundancy that doesn't work is this: The

database involves redundancy if and only if it contains two distinct appearances of the same tuple. But we can

validly say the following:



Definition: The database involves redundancy if and only if it contains, directly or indirectly, two distinct

representations of the same proposition.

The trouble is, although this definition is clearly correct, it doesn't help much with the practical problem of

reducing redundancy. But it does at least imply the following, which is a little better:



Definition ( preliminary version ): Let D be a database design; let DB be a database value (i.e., a set of values

for the relvars mentioned in D ) that conforms to D ; and let p be a proposition. Further, let DB contain some

specific appearance of some tuple, or some combination of tuples, that represents p (either explicitly or

3 Emphasis on supposed ─I'm describing an ideal situation here. Today's reality is rather messier, as I'm sure you know.

Database Design and Relational Theory

Search WWH ::

Custom Search

Home