We Need More Science - Database Design and Relational Theory

These predicates aren't existentially quantified, and so the corresponding propositions aren't, either. 21

c. Relvar S certainly contains subtuples corresponding to SN, SC, and CT; however, those corresponding to SN

and SC are never repeated because {SNO} is a key. By contrast, those corresponding to CT are repeated, at

least potentially (as we know from Fig. 1.1), and such repetition does constitute redundancy.
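
The contrast between the never-repeated and the potentially repeated subtuples can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample tuples are made-up supplier values (not quoted from Fig. 1.1), SN and CT are assumed to correspond to the attribute pairs {SNO, SNAME} and {CITY, STATUS}, and the helper name `repeated_subtuples` is hypothetical.

```python
from collections import Counter

# Assumed attribute order for relvar S; values below are illustrative only.
ATTRS = ("SNO", "SNAME", "STATUS", "CITY")
S = [
    ("S1", "Smith", 20, "London"),
    ("S2", "Jones", 30, "Paris"),
    ("S3", "Blake", 30, "Paris"),
    ("S4", "Clark", 20, "London"),
    ("S5", "Adams", 30, "Athens"),
]

def repeated_subtuples(relation, attrs, projection):
    """Return the subtuples over `projection` that occur in more than one tuple."""
    idx = [attrs.index(a) for a in projection]
    counts = Counter(tuple(t[i] for i in idx) for t in relation)
    return {sub: n for sub, n in counts.items() if n > 1}

# Subtuples that include the key {SNO} can never repeat.
sn = repeated_subtuples(S, ATTRS, ("SNO", "SNAME"))
# Subtuples over {CITY, STATUS} can repeat, and here they do.
ct = repeated_subtuples(S, ATTRS, ("CITY", "STATUS"))
```

With this sample data, `sn` is empty, while `ct` reports the London and Paris subtuples twice each, which is the kind of repetition the text identifies as redundancy.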

With all of the foregoing by way of motivation, then, I offer the following as a putative “final” definition of

what it means for a database to exhibit redundancy:



Definition (“final” version): Let D be a database design; let DB be a database value (i.e., a set of values for

the relvars mentioned in D) that conforms to D; and let p be a proposition not involving any existential

quantification. If DB contains two or more distinct representations of p (either explicitly or implicitly), then

DB contains, and D permits, redundancy.

Observe in particular that a database can still display redundancy by this definition, even if it fully conforms

to The Principle of Orthogonal Design and all normalization principles. Note, however, that the definition still says

“if” (not “if and only if”) a certain condition holds, then there's redundancy; I'd like to replace that “if” by “if and only if,”

but I don't quite have the courage of my convictions here. Not yet.

Be that as it may, let's consider Examples 1-12 from earlier sections of this chapter and see what the

implications are for those examples specifically. Please note carefully: For simplicity, I use the unqualified term

proposition throughout the following analysis to mean a proposition that's not existentially quantified.

Examples 1-2

Both of these examples display redundancy because the proposition City SFO and state CA are the city and state for

zip 94100 is represented twice.

Example 3

Suppose two distinct tuples both contain the DATE value “Tuesday, January 18th, 2011”; then the database clearly

displays redundancy because the proposition January 18th, 2011 is a Tuesday is represented twice, explicitly. In

fact, there's redundancy even if that DATE value appears just once! The reason is that even in that case, the

proposition January 18th, 2011 is a Tuesday is represented both explicitly and implicitly, as a result of the fact that

one part of the value, the day of the week (Tuesday, in the example), can be determined algorithmically from the rest

of the value (January 18th, 2011, in the example).
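
To make the "determined algorithmically" point concrete, here is a minimal Python sketch using the standard `datetime` module. The tuple layout `stored` and the helper `weekday_name` are assumptions made for illustration; the text does not say how the DATE value is actually represented.

```python
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def weekday_name(d: date) -> str:
    """Derive the day of the week from the calendar date alone."""
    return DAY_NAMES[d.weekday()]  # date.weekday() returns 0 for Monday

# Hypothetical stored value: the day name kept alongside the calendar date.
stored = ("Tuesday", date(2011, 1, 18))

# The same fact, recomputed from the rest of the value.
derived = weekday_name(stored[1])

# If the stored and derived day names agree, the stored name adds no
# information: the proposition is represented explicitly and implicitly.
is_represented_twice = (stored[0] == derived)
```

Because the day name can always be recomputed, storing it explicitly gives a second representation of the same proposition, even when the DATE value appears in only one tuple.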

Example 4

Let employee E1 be represented in both relvar EMP and relvar PGMR. The corresponding propositions are E1 is an

employee and E1 is a programmer. The former proposition is a logical consequence of the combination of the latter

proposition together with the proposition All programmers are employees. (Note that this latter proposition is

represented, in effect, by the fact that {ENO} in PGMR is a foreign key referencing {ENO} in EMP.) Thus, the

proposition E1 is an employee is represented twice, once explicitly and once implicitly.
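
The reasoning in Example 4 can be sketched as follows. This is an illustrative model only: each relvar is reduced to a set of ENO values, the sample employee numbers other than E1 are invented, and the function names `foreign_key_holds` and `representations_of_employee` are hypothetical.

```python
# Each relvar is modeled as the set of ENO values it contains (assumed data).
EMP = {"E1", "E2", "E3"}
PGMR = {"E1"}

def foreign_key_holds(referencing: set, referenced: set) -> bool:
    """The constraint 'All programmers are employees' as a subset check."""
    return referencing <= referenced

def representations_of_employee(eno: str, emp: set, pgmr: set) -> list:
    """List the ways the proposition '<eno> is an employee' is represented."""
    reps = []
    if eno in emp:
        reps.append("explicit: tuple in EMP")
    if eno in pgmr and foreign_key_holds(pgmr, emp):
        reps.append("implicit: tuple in PGMR plus foreign key to EMP")
    return reps

e1_reps = representations_of_employee("E1", EMP, PGMR)
e2_reps = representations_of_employee("E2", EMP, PGMR)
```

Here `e1_reps` has two entries (one explicit, one implicit), while `e2_reps` has only the explicit one, since E2 is not a programmer in this sample.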

21 In connection with the lack of quantification in the predicate for CT in particular, you might want to take another look at the section

“Normalization Serves Two Purposes” in Chapter 3.
