These predicates aren't existentially quantified, and so the corresponding propositions aren't, either. 21
c. Relvar S certainly contains subtuples corresponding to SN, SC, and CT; however, those corresponding to SN
and SC are never repeated because {SNO} is a key. By contrast, those corresponding to CT are repeated, at
least potentially (as we know from Fig. 1.1), and such repetition does constitute redundancy.
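
The repetition described in point c can be checked mechanically. The following is a minimal sketch, not the author's method. It assumes that relvar S has attributes SNO, SNAME, STATUS, and CITY, and that SN, SC, and CT correspond to the attribute sets {SNO, SNAME}, {SNO, CITY}, and {CITY, STATUS}. The sample tuples are illustrative only (they are not claimed to match Fig. 1.1), and the helper name repeated_subtuples is hypothetical.

```python
from collections import Counter

# Illustrative sample value for relvar S (assumed attributes and data,
# not taken from the source). {SNO} is a key, so SNO values are unique.
S = [
    {"SNO": "S1", "SNAME": "Smith", "STATUS": 20, "CITY": "London"},
    {"SNO": "S2", "SNAME": "Jones", "STATUS": 30, "CITY": "Paris"},
    {"SNO": "S3", "SNAME": "Blake", "STATUS": 30, "CITY": "Paris"},
    {"SNO": "S4", "SNAME": "Clark", "STATUS": 20, "CITY": "London"},
]

def repeated_subtuples(tuples, attrs):
    """Return the subtuples over attrs that occur in more than one tuple."""
    counts = Counter(tuple(t[a] for a in attrs) for t in tuples)
    return {sub: n for sub, n in counts.items() if n > 1}

# Subtuples that include the key {SNO} can never repeat.
sn = repeated_subtuples(S, ("SNO", "SNAME"))
sc = repeated_subtuples(S, ("SNO", "CITY"))
# Subtuples for CT do not include the key, so they can repeat.
ct = repeated_subtuples(S, ("CITY", "STATUS"))
```

With this sample data, sn and sc come back empty, while ct reports each city/status pair twice, which is the redundancy the text refers to.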
With all of the foregoing by way of motivation, then, I offer the following as a putative “final” definition of
what it means for a database to exhibit redundancy:
Definition (“final” version): Let D be a database design; let DB be a database value (i.e., a set of values for
the relvars mentioned in D) that conforms to D; and let p be a proposition not involving any existential
quantification. If DB contains two or more distinct representations of p (either explicitly or implicitly), then
DB contains, and D permits, redundancy.
Observe in particular that a database can still display redundancy by this definition, even if it fully conforms
to The Principle of Orthogonal Design and all normalization principles. Note, however, that the definition still says
“if” (not “if and only if”) a certain condition holds, then there's redundancy; I'd like to replace that “if” by “if and only if”,
but I don't quite have the courage of my convictions here. Not yet.
Be that as it may, let's consider Examples 1-12 from earlier sections of this chapter and see what the
implications are for those examples specifically. Please note carefully: For simplicity, I use the unqualified term
proposition throughout the following analysis to mean a proposition that's not existentially quantified.
Examples 1-2
Both of these examples display redundancy because the proposition “City SFO and state CA are the city and state for
zip 94100” is represented twice.
Example 3
Suppose two distinct tuples both contain the DATE value “Tuesday, January 18th, 2011”; then the database clearly
displays redundancy because the proposition “January 18th, 2011 is a Tuesday” is represented twice, explicitly. In
fact, there's redundancy even if that DATE value appears just once! The reason is that even in that case, the
proposition “January 18th, 2011 is a Tuesday” is represented both explicitly and implicitly, as a result of the fact that
one part of the value, the day of the week (Tuesday, in the example), can be determined algorithmically from the rest
of the value (January 18th, 2011, in the example).
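
To make the “determined algorithmically” point concrete, here is a small sketch using Python's standard datetime module. The function name weekday_name and the explicit list of day names are illustrative choices, not part of the original text.

```python
from datetime import date

# date.weekday() returns 0 for Monday through 6 for Sunday.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def weekday_name(d):
    """Derive the day of the week from the rest of the date value."""
    return DAY_NAMES[d.weekday()]

stored_day = "Tuesday"            # the explicitly stored part of the value
derived_day = weekday_name(date(2011, 1, 18))  # the implicit representation
```

Because derived_day can always be computed from the year, month, and day, storing the day name alongside them represents the same proposition a second time.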
Example 4
Let employee E1 be represented in both relvar EMP and relvar PGMR. The corresponding propositions are “E1 is an
employee” and “E1 is a programmer”. The former proposition is a logical consequence of the combination of the latter
proposition together with the proposition “All programmers are employees”. (Note that this latter proposition is
represented, in effect, by the fact that {ENO} in PGMR is a foreign key referencing {ENO} in EMP.) Thus, the
proposition “E1 is an employee” is represented twice, once explicitly and once implicitly.
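
The reasoning in Example 4 can be sketched as follows. This is a simplified model under stated assumptions: each relvar is reduced to a set of ENO values, the sample employee numbers are made up, and the function names foreign_key_holds and representations_of_is_employee are hypothetical.

```python
# Each relvar is modeled as a set of ENO values (illustrative data only).
EMP = {"E1", "E2", "E3"}
PGMR = {"E1", "E3"}

def foreign_key_holds(referencing, referenced):
    """{ENO} in PGMR references {ENO} in EMP: every programmer is an employee."""
    return referencing <= referenced

def representations_of_is_employee(eno, emp, pgmr):
    """List the distinct ways the proposition '<eno> is an employee' is represented."""
    reps = []
    if eno in emp:
        reps.append("explicit: tuple in EMP")
    if eno in pgmr and foreign_key_holds(pgmr, emp):
        # '<eno> is a programmer' plus 'all programmers are employees'
        reps.append("implicit: tuple in PGMR plus foreign key to EMP")
    return reps

e1_reps = representations_of_is_employee("E1", EMP, PGMR)
e2_reps = representations_of_is_employee("E2", EMP, PGMR)
```

In this sketch E1 yields two representations (one explicit, one implicit), whereas E2, who is not a programmer, yields only the explicit one.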
21 In connection with the lack of quantification in the predicate for CT in particular, you might want to take another look at the section
“Normalization Serves Two Purposes” in Chapter 3.