contain duplicate elements, by definition─instead of “bags,” which do, as the right mathematical abstraction on
which to found a solid database theory. SQL apologists please note!
Aside: I observe in passing that now we have a precise characterization of the notion of “duplicate tuples.”
(People use this phrase all the time, and yet I very much doubt whether many of them would be able to define
it precisely if pressed.) Strictly speaking, of course, two tuples are duplicates if and only if they're the very
same tuple, just as two integers are duplicates if and only if they're the very same integer. The phrase
“duplicate tuples” thus doesn't really make much sense from a logical point of view (to say two distinct
tuples are duplicates is a contradiction in terms). What people are really talking about when they use that
phrase is duplicate appearances of the same tuple. For that reason, the phrase “duplicate elimination,” which
as we all know is often encountered in database contexts, would much better be “duplication elimination.” But
I digress … Let's get back to the main discussion. End of aside.
Second, we also don't want the same subtuple to appear more than once in the same relvar (again, at the
same time). 6 But classical normalization takes care of this one; e.g., it's precisely because the FD {CITY} →
{STATUS} holds in relvar S, causing the same {CITY,STATUS} pair to occur repeatedly (with the same meaning
every time it appears), that we're recommended to replace that relvar by its projections on {CITY,STATUS} and
{SNO,SNAME,CITY}.
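To make the point about repeated subtuples concrete, here is a minimal Python sketch. It is not Tutorial D, and nothing in it comes from the text itself: the sample supplier values, the helper names tup, project, and join, and the variable names CS and SNC are all assumptions made for illustration. A relation is modeled as a set of tuples, and each tuple as a frozenset of (attribute name, value) pairs.

```python
# Illustrative sketch only: sample values and helper names are assumptions.

def tup(**attrs):
    """Build a tuple as a frozenset of (attribute name, value) pairs."""
    return frozenset(attrs.items())

def project(relation, attr_names):
    """Projection: keep only the named attributes; the set drops repeats."""
    keep = set(attr_names)
    return {frozenset((a, v) for a, v in t if a in keep) for t in relation}

def join(r1, r2):
    """Natural join: merge each pair of tuples that agree on common attributes."""
    result = set()
    for t1 in r1:
        for t2 in r2:
            d1, d2 = dict(t1), dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                result.add(frozenset({**d1, **d2}.items()))
    return result

# Hypothetical value of relvar S in which the FD {CITY} -> {STATUS} holds.
S = {
    tup(SNO="S1", SNAME="Smith", STATUS=20, CITY="London"),
    tup(SNO="S2", SNAME="Jones", STATUS=30, CITY="Paris"),
    tup(SNO="S3", SNAME="Blake", STATUS=30, CITY="Paris"),
    tup(SNO="S4", SNAME="Clark", STATUS=20, CITY="London"),
    tup(SNO="S5", SNAME="Adams", STATUS=30, CITY="Athens"),
}

# In S, the {CITY,STATUS} subtuple for London is recorded once per London supplier.
london_rows = [t for t in S if dict(t)["CITY"] == "London"]

# After decomposition, each {CITY,STATUS} pair appears exactly once.
CS = project(S, ["CITY", "STATUS"])
SNC = project(S, ["SNO", "SNAME", "CITY"])

print(len(london_rows))    # appearances of the London subtuple in S
print(len(CS))             # distinct {CITY,STATUS} pairs
print(join(CS, SNC) == S)  # whether joining the projections recovers S
```

In this sketch the join of the two projections gives back the original value of S, which is what we would expect of a nonloss decomposition based on the FD.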
My next point is that the very same tuple can represent any number of distinct propositions, as can easily be
seen. As a trivial example, let SC and PC be the projection of relvar S on CITY and the projection of relvar P on
CITY, respectively. Given our usual sample values, then, a tuple containing just the CITY value “London” appears
in both SC and PC, but those two appearances represent distinct propositions. To be specific: The appearance in
SC represents the proposition “There's at least one supplier in London,” and the appearance in PC represents the
proposition “There's at least one part in London” (simplifying slightly in both cases for the sake of the example).
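The SC and PC example can be sketched in Python as follows. Again this is only an illustration under stated assumptions: the supplier and part values are hypothetical stand-ins for the usual sample values (reduced to the attributes needed here), and the helper names tup, project, and PREDICATES are invented for the sketch. The point it tries to show is that the tuple alone doesn't determine the proposition; the predicate of the relvar in which the tuple appears does.

```python
# Illustrative sketch only: sample values and helper names are assumptions.

def tup(**attrs):
    """Build a tuple as a frozenset of (attribute name, value) pairs."""
    return frozenset(attrs.items())

def project(relation, attr_names):
    """Projection: keep only the named attributes; the set drops repeats."""
    keep = set(attr_names)
    return {frozenset((a, v) for a, v in t if a in keep) for t in relation}

# Hypothetical values for relvars S and P, cut down to two attributes each.
S = {
    tup(SNO="S1", CITY="London"),
    tup(SNO="S2", CITY="Paris"),
    tup(SNO="S4", CITY="London"),
}
P = {
    tup(PNO="P1", CITY="London"),
    tup(PNO="P2", CITY="Paris"),
    tup(PNO="P3", CITY="Oslo"),
}

SC = project(S, ["CITY"])
PC = project(P, ["CITY"])

# One and the same tuple appears in both projections.
london = tup(CITY="London")
print(london in SC, london in PC)

# The meaning depends on the relvar predicate, not on the tuple alone.
PREDICATES = {
    "SC": "There's at least one supplier in {CITY}",
    "PC": "There's at least one part in {CITY}",
}
sc_proposition = PREDICATES["SC"].format(**dict(london))
pc_proposition = PREDICATES["PC"].format(**dict(london))
print(sc_proposition)
print(pc_proposition)
```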
What's more (and here I have to get a little more formal on you for a moment), the same proposition can be
represented by any number of distinct tuples, too. That's because, formally, the pertinent attribute names are part of
the tuple (check the definition of tuple in Chapter 5 if you need confirmation of this point). Thus, for example, we
might have our usual shipments relvar SP with its attributes SNO, PNO, and QTY, and predicate:
Supplier SNO supplies part PNO in quantity QTY.
We might additionally have a relvar PS with attributes SNR, PNR, and AMT, with predicate:
Supplier SNR supplies part PNR in quantity AMT.
And then (to use Tutorial D syntax) the following tuples might appear in relvars SP and PS, respectively:
TUPLE { SNO 'S1' , PNO 'P1' , QTY 300 }
TUPLE { SNR 'S1' , PNR 'P1' , AMT 300 }
These are clearly different tuples, but they both represent the same proposition:
6 This statement too is hugely oversimplified. A slightly better one is: We don't want the same subtuple to appear more than once if distinct
appearances represent the same proposition. But this statement isn't perfect, either. To try to make it more precise still would take us further
afield than I wish to go at this point, however. See Chapter 15 for further explanation.
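The SP and PS tuples discussed above can also be compared in a short Python sketch. The modeling choices are assumptions, not anything the text prescribes: a tuple is a frozenset of (attribute name, value) pairs, so that attribute names are part of the tuple, and the hypothetical helper proposition simply substitutes a tuple's values into the predicate of its relvar.

```python
# Illustrative sketch only: helper names are assumptions.

def tup(**attrs):
    """Build a tuple as a frozenset of (attribute name, value) pairs."""
    return frozenset(attrs.items())

def proposition(predicate, t):
    """Instantiate a relvar predicate with the attribute values of tuple t."""
    return predicate.format(**dict(t))

# Predicates for relvars SP and PS, with attribute names as placeholders.
SP_PREDICATE = "Supplier {SNO} supplies part {PNO} in quantity {QTY}."
PS_PREDICATE = "Supplier {SNR} supplies part {PNR} in quantity {AMT}."

sp_tuple = tup(SNO="S1", PNO="P1", QTY=300)
ps_tuple = tup(SNR="S1", PNR="P1", AMT=300)

# The attribute names differ, so the tuples are different tuples.
print(sp_tuple == ps_tuple)

# Instantiating each predicate nevertheless yields one and the same proposition.
sp_prop = proposition(SP_PREDICATE, sp_tuple)
ps_prop = proposition(PS_PREDICATE, ps_tuple)
print(sp_prop)
print(sp_prop == ps_prop)
```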