Primary Keys Are Nice but Not Essential - Database Design and Relational Theory

Databases Reference

In-Depth Information

requires primary keys─in fact, keys in general─to be irreducible, meaning they mustn't contain any attributes

that are irrelevant for unique identification purposes (and there are good reasons for that requirement, too, as

we know from Chapter 4). In other words, {INVNO,SHIPNO} just isn't a key (and so it certainly can't be

the primary key) for either relvar, and we'd be lying if we told the system otherwise. Indeed, if

{INVNO,SHIPNO} were truly a key, then the relationship between invoices and shipments would be many

to many, which it isn't. End of aside .

Precisely because constraint C holds, the design of Fig. A.1 clearly involves some redundancy: Every pair of

{INVNO,SHIPNO} values appearing in either relvar also necessarily appears in the other. Now, we could avoid

that redundancy by combining the two relvars into one:

INV_SHIP { INVNO , SHIPNO , INV_DETAILS , SHIP_DETAILS }

PRIMARY KEY { INVNO }

ALTERNATE KEY { SHIPNO }

By eliminating the redundancy in this way, we've also eliminated the need to state and enforce constraint C .

Furthermore, we could now define the original INVOICE and SHIPMENT relvars as views─specifically, projection

views─of INV_SHIP, thus allowing the user still to regard invoices and shipments as distinct entities. 9 This revised

design thus does enjoy certain advantages over the “obvious” version.

On the other hand, there are some disadvantages, too. Observe first that we've had to make an asymmetric

decision once again, choosing {INVNO} over {SHIPNO}─arbitrarily─as the primary key for relvar INV_SHIP.

Second, suppose further that shipments have certain subsidiary information that invoices don't; e.g., suppose

shipments are containerized, each shipment involving several containers. Then a new CONTAINER relvar is

needed:

CONTAINER { CONTNO , SHIPNO , ... }

PRIMARY KEY { CONTNO }

FOREIGN KEY { SHIPNO } REFERENCES INV_SHIP { SHIPNO }

And now we have a foreign key referencing an alternate key!─which is prohibited by the relational model as

originally defined, as we know.

Now, can we avoid this apparent violation of the prescriptions of the original model? Well, of course, the

answer is yes . There are various ways in which this might be done:

1.

We could go back to the two-relvar design (thereby reintroducing the data redundancy and the need for the

additional constraint, however).

2.

We could replace SHIPNO by INVNO in the CONTAINER relvar. However, this approach seems very

artificial (containers have nothing to do with invoices per se), and moreover introduces an unpleasant level of

indirection into the design (the shipment for a given container would be accessible only via the

corresponding invoice).

3.

We could leave the CONTAINER relvar as it is, but replace the foreign key specification by an explicit

declaration to the effect that every SHIPNO value in CONTAINER must also appear in INV_SHIP (using a

language like SQL or Tutorial D that permits the definition of arbitrarily complex constraints). But it does

9 There might be some difficulty over updating those views, of course, given the state of today's commercial products. But this is a separate

issue, beyond the scope of this appendix (and this topic).

Search WWH ::

Custom Search

Home