Primary Keys Are Nice but Not Essential - Database Design and Relational Theory

2. The discipline of using the same symbol to identify a given entity everywhere it's referenced allows the

system to recognize the fact that those references do all refer to the same thing.

This argument is clearly valid as far as it goes, but I now feel the discipline referred to should be treated as an

informal guideline rather than a hard-and-fast requirement. See the discussions later in this appendix (in particular,

the applicants and employees example) for examples of situations in which it might be desirable not to follow such

a guideline in practice. In any case, the guideline in question really has to do with design (in other words, with how

to apply the relational model in some specific situation), not with the relational model as such; in particular,

therefore, it has nothing to do with whether the relational model as such should insist on primary keys. I must have

been a little confused when I advanced this argument originally.

3. “Metaqueries” (i.e., queries against the catalog) can be more difficult to formulate if entities are identified in

different ways in different places. For example, consider what's involved in formulating the metaquery

“Which relvars refer to employees?” if employees are referred to sometimes by employee number and

sometimes by social security number.

The idea here is basically that the discipline referred to under point 2 above can be beneficial for the user as

well as the system. Again, however, it seems to me that we're really talking about informal guidelines, not absolute

requirements.
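
The metaquery point can be made concrete with a minimal sketch in Python. It assumes a toy catalog held as a list of tuples, each recording one reference from one relvar to another; the relvar names (EMP, DEPT_MGR, PAYROLL, PROJECT), the attribute names, and the helper relvars_referencing are all hypothetical and are not taken from any real catalog.

```python
# Toy catalog: one entry per reference, in the form
# (referencing relvar, referencing attribute, referenced relvar, referenced key).
# Every name below is hypothetical and chosen only for illustration.
CATALOG = [
    ("DEPT_MGR", "EMP_NO", "EMP", "EMP_NO"),
    ("PAYROLL", "SSN", "EMP", "SSN"),
    ("PROJECT", "DEPT_NO", "DEPT", "DEPT_NO"),
]


def relvars_referencing(target, via_keys):
    """Return, sorted, the relvars that reference `target` through any key in `via_keys`."""
    return sorted({
        referencing
        for referencing, _attr, referenced, key in CATALOG
        if referenced == target and key in via_keys
    })


# Asking only about employee number misses the relvar that uses the SSN.
only_emp_no = relvars_referencing("EMP", {"EMP_NO"})

# To answer "Which relvars refer to employees?" fully, the metaquery
# has to name every key by which employees are identified.
all_refs = relvars_referencing("EMP", {"EMP_NO", "SSN"})
```

In this sketch the extra difficulty amounts to one more element in a set, which is consistent with treating the discipline as a helpful guideline rather than an absolute requirement.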

4. My next point wasn't exactly an argument for the PK:AK distinction, but rather a criticism of an argument

against it. This latter argument went as follows: Suppose some user is prevented, for security reasons, from

seeing some primary key; then that user needs access to the data by some alternate key instead; so why make

the PK:AK distinction in the first place?

I still don't find “this latter argument” very convincing, but of course criticizing an argument against some

position doesn't prove the contrary position is correct!

5. My final point was an appeal to Occam's Razor (“concepts should not be multiplied beyond necessity”). In

effect, I was arguing that to treat all candidate keys as equals was to complicate the relational model's tuple

level addressing scheme unnecessarily. But it might well be argued (and now I would argue) that Occam's

Razor actually applies the other way around, and that it's the concepts of primary key and alternate key that

are unnecessary! That is, all we really need is candidate keys, or in other words just keys tout court.
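
As a rough illustration of what treating all keys as equals might look like in practice, the following sketch uses Python's standard sqlite3 module. It assumes a hypothetical emp table with two candidate keys (emp_no and ssn), each declared NOT NULL UNIQUE, and no key singled out as primary. SQL tables are only an approximation of relvars, and the table, columns, sample rows, and the helper violates_key are invented for this example.

```python
import sqlite3

# In-memory database; the emp table and its columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE emp (
        emp_no TEXT NOT NULL UNIQUE,   -- candidate key 1
        ssn    TEXT NOT NULL UNIQUE,   -- candidate key 2
        name   TEXT NOT NULL
    )
""")
conn.execute("INSERT INTO emp VALUES ('E1', '111-11-1111', 'Ann')")


def violates_key(row):
    """Try to insert `row`; return True if a key constraint rejects it."""
    try:
        conn.execute("INSERT INTO emp VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError:
        return True
    return False


# Both keys are enforced in exactly the same way; neither is "primary".
dup_emp_no = violates_key(("E1", "222-22-2222", "Bob"))  # repeats emp_no
dup_ssn = violates_key(("E2", "111-11-1111", "Cy"))      # repeats ssn
fresh = violates_key(("E3", "333-33-3333", "Di"))        # repeats neither
```

Under these assumptions the uniqueness of each key is enforced identically, and nothing in the declaration privileges one key over the other.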

In a nutshell, the foregoing arguments no longer seem to me very compelling; the only one that still appears

to have any validity is the one summarized under points 2 and 3 above, which (as I've said) isn't really an argument

for making the PK:AK distinction in the relational model as such, anyway. As I've also said, I now feel the position

supported by that particular argument should be seen more as a guideline than as an inviolable rule (again, see later

for examples to justify this position).

I note in passing, though, that I did hedge my bets somewhat in my original paper ... Here's another extract

(I've reworded it just slightly here):

Note that if we can agree on retaining the PK:AK distinction for now, there's always the possibility of eliminating that

distinction if desirable at some future time. And note moreover that this argument doesn't apply in the opposite

direction: Once we're committed to treating all candidate keys equally, a system that requires a distinguished primary

key will forever be nonstandard.
