Databases Reference
In-Depth Information
Chapter 15
W e N e e d M o r e S c i e n c e
What I tell you three times is true
─Lewis Carroll: The Hunting of the Snark
Note: Portions of this chapter originally appeared, in considerably different form, in my topic Date on Database:
Writings 2000-2006 (Apress, 2006).
Redundant adj. de trop, diffuse, excessive, extra, inessential, inordinate, padded, periphrastic, pleonastical,
prolix, repetitious, supererogatory, superfluous, supernumerary, surplus, tautological, unemployed,
unnecessary, unneeded, unwanted, verbose, wordy
I give the foregoing splendid list of synonyms here purely for whatever intrinsic interest it might have (I
found it in Chambers Twentieth Century Thesaurus , which also gives the following nice list of antonyms: concise,
essential, necessary). Be that as it may, we've seen that design theory in general can be regarded as a set of
principles and techniques for reducing redundancy (and thereby reducing the potential for certain inconsistencies
and update anomalies that might otherwise occur). But what exactly is redundancy? We don't seem to have a very
precise definition of the term; we just have a somewhat vague idea that it can lead to problems, at least if it isn't
managed properly. 1
In order to get a slightly better handle on this question, we first need to distinguish clearly between the
logical and physical levels of the system. Obviously the design goals are different at the two levels. At the physical
level, redundancy will almost certainly exist in some shape or form. Here are a couple of reasons why:
Indexes and other such “fast access path” structures necessarily entail some redundancy, because certain data
values are stored both in those auxiliary structures and in the structures to which they provide that “fast
access.”
Derived relvars (and/or derived relations) that are physically stored in some way─what are known variously
as snapshots or summary tables or materialized queries or materialized views 2 ─also obviously involve some
redundancy.
The reason for redundancy at the physical level is performance, of course. But physical redundancy has (or
should have!) no effect on the logical level; it's managed by the DBMS, and it's never directly seen by the user. I
mention it here only to get it out of the way, as it were. From this point forward, I'll be concerned only with
redundancy at the logical level.
1 I remind you, though, that we do at least have a precise definition of the kind of redundancy that can be removed by taking projections (see
Chapter 13).
2 This last term is strongly deprecated, by the way, because the construct in question isn't a view. Views are virtual, not materialized (at least as
far as the relational model is concerned), and materialized view is thus a contradiction in terms. The proper term is snapshot .
Search WWH ::




Custom Search