Primary Keys Are Nice but Not Essential - Database Design and Relational Theory

Databases Reference

In-Depth Information



Requires primary keys for base relvars;



Permits but does not require them for views and snapshots; and



Considers it “ completely unnecessary for primary keys to be declared or deduced” for any other relvars

(italics in the original).

These statements are paraphrased (but only slightly) from the paper in which Codd said there was no formal

basis for choosing the primary key (see earlier in this appendix). As a matter of fact, that paper goes so far as to

suggest that relvars other than base ones might not even possess a primary key, a suggestion that if true surely raises

serious questions about the concept in the first place─remember The Principle of Interchangeability (of base relvars

and views). Be that as it may, my position on these matters is rather different. To be specific, I would say:



First, every relvar, base or derived, does have at least one key (because, of course, no relation, and a fortiori

no relvar, ever permits duplicate tuples).



Second, every base relvar must have at least one key explicitly declared. Preferably, of course, all such keys

should be explicitly declared.



Often a base relvar will have an explicitly declared primary key in particular, but I don't insist on this state of

affairs as a hard requirement.



For reasons explained in detail in SQL and Relational Theory , I believe the system should be able to deduce

keys for derived relvars.



The previous point notwithstanding, I believe it should also be possible to declare keys for derived relvars

(for views and snapshots in particular). Again, see SQL and Relational Theory for further discussion.

THE INVOICES AND SHIPMENTS EXAMPLE

I now turn my attention to a more elaborate example. The example─which is based on a real world application,

incidentally─concerns invoices and shipments, and there's a one to one relationship between these two entity types:

Each shipment has exactly one invoice, each invoice has exactly one shipment. Here then is the “obvious” database

design (for the sake of the example, I use a hypothetical syntax that explicitly distinguishes between primary and

alternate keys): 6

INVOICE { INVNO , SHIPNO , INV_DETAILS }

PRIMARY KEY { INVNO }

ALTERNATE KEY { SHIPNO }

FOREIGN KEY { SHIPNO } REFERENCES SHIPMENT

6 One reviewer asked why a design consisting of three relvars (one each for invoices and shipments and one for the association between them)

wasn't the “obvious” design. Well, it's probably a better design, and it might be the obvious one. But that association relvar still has two keys

(INVNO and SHIPNO), and the major conclusion of the argument that follows─viz., that those two keys need to be treated as equals─still stands.

Search WWH ::

Custom Search

Home