Databases Reference
In-Depth Information
So much for the preamble; now let's get down to specifics.
JOIN DEPENDENCIES─THE BASIC IDEA
Most of the time in this topic so far, I've been making a tacit assumption: namely, I've been assuming that when we
decompose some relvar, we always do so by replacing that relvar by exactly two of its projections. 3 (Note that
Heath's Theorem, which provides the formal underpinning for most of what I've had to say so far regarding nonloss
decomposition, specifically addresses decomposition into exactly two projections.) What's more, that assumption
was fully warranted, so long as our target was only BCNF─in other words, it successfully carried us as far as that
specific target. So you might be surprised to learn that there exist relvars that can't be nonloss decomposed into two
projections but can be nonloss decomposed into three (or maybe more than three).
As an aside, I note that, remarkably enough, Codd gave an example in 1969, in his very first paper on the
relational model (see Appendix C), that showed he was aware of this possibility. However, that example was
apparently overlooked by most of the paper's original readers; certainly it seems to have come as a surprise to the
research community when the possibility was rediscovered several years later (in 1977, to be precise).
Now, I said earlier, albeit loosely, that 5NF had to do with JDs that aren't implied by FDs. I can now add,
though again speaking very loosely, that it has to do with relvars that can't be nonloss decomposed into two
projections but can be nonloss decomposed into three or more. In other words, it's when these circumstances
arise─i.e., when there are JDs that aren't implied by FDs, and relvars that can only be nonloss decomposed into
more than two projections─that you do really have to come to grips with JDs and 5NF.
So what exactly do we mean when we say some JD holds in some relvar? Here's a definition:
Definition: Let X1 , ..., Xn be subsets of the heading H of relvar R ; then the join dependency (JD)
{ X1 , ... , Xn }
holds in R if and only if R can be nonloss decomposed into its projections on X1 , ..., Xn (i.e., if and only if
every legal value r of R is equal to the join of the corresponding projections r1 , ..., rn ). X1 , ..., Xn are the
components of the JD, and the JD overall can be read as “star X1 , ..., Xn ” or “join X1 , ..., Xn ”─though I
hasten to add that “join” really isn't the mot juste here, because the join operator (at least as usually
understood) joins relations , and X1 , ..., Xn aren't relations but headings.
By way of a simple example, consider the suppliers relvar S once again. As we know, that relvar is subject
to the FD {CITY} → {STATUS}, and so Heath's Theorem tells us it can be nonloss decomposed into its projections
on {SNO,SNAME,CITY} and {CITY,STATUS}. In other words, the following JD holds in that relvar:
{ { SNO , SNAME , CITY } , { CITY , STATUS } }
Points arising:
3 Of course, I'm referring here to an individual step in the overall process. Clearly, repeated steps─i.e., repeated individual decompositions─will
yield a result consisting of more than two projections, in general, even if each individual decomposition is into just two.
Search WWH ::

Custom Search