Databases Reference
In-Depth Information
a.
The result of the query is kept in the database under the specified name (TOTALS in the example) as a read-
only relvar (read-only, that is, apart from the periodic refresh─see point b. immediately following).
b.
Periodically (ON EVERY UPDATE in the example) the snapshot is refreshed ─that is, its current value is
discarded, the query is executed again, and the result of that new execution becomes the new snapshot value.
The general form of the REFRESH clause is
REFRESH EVERY <now and then>
where <now and then> might be, for example, MONTH or WEEK or DAY or HOUR or n MINUTES or
MONDAY or WEEKDAY (and so on). In particular, the specification REFRESH [ON] EVERY UPDATE means
the snapshot is kept permanently in synch with the relvar(s) from which it is derived─which is presumably just what
we want, in the case of Example 12.
Now, in this section so far I've concentrated on Example 12 and “derived data.” However, the fact is that all
forms of redundancy can be thought of as derived data: If x is redundant, then by definition x can be derived from
something else in the database. (Limiting use of the term derived data to the kind of situation illustrated by
Example 12 is thus misleading, and not recommended.) It follows that the foregoing analysis─in particular, the four
different approaches to dealing with derived data─can be generalized to apply to all kinds of redundancy, at least in
principle. Note in particular that the third and fourth of those approaches, using views and snapshots respectively,
both constitute examples of what's sometimes called controlled redundancy. Redundancy is said to be controlled if
it does exist (and the user is aware of it), but the task of “propagating updates” to ensure that it never leads to any
inconsistencies is managed by the system, not the user. Uncontrolled redundancy can be a problem, but controlled
redundancy shouldn't be. In fact, I want to go further─I want to say that while it's probably impossible, and maybe
not even desirable, to eliminate redundancy one hundred percent, any redundancy that isn't eliminated ought at least
to be controlled. In particular, we need support for snapshots. (Fortunately, many commercial products do now
support snapshots, albeit under the deprecated name materialized views .)
REFINING THE DEFINITION
I've deliberately left this section to the very end of the chapter (almost). Consider the shipments relvar SP, with its
predicate Supplier SNO supplies part PNO in quantity QTY . Consider also the relation shown as the value of that
relvar in Fig. 1.1 in Chapter 1. Observe that:
a.
Two of the tuples in that relation are (S1,P5,100) and (S1,P6,100).
b.
Both of those tuples include (S1,100) as a subtuple.
What do those two appearances of that subtuple mean? Well, the appearance in (S1,P5,100) means:
1.
Supplier S1 supplies some part in quantity 100.
(I've numbered this proposition─note that it is indeed a proposition─for purposes of future reference.) And the
appearance in (S1,P6,100) means exactly the same thing! So don't we have here a situation in which the database
contains two distinct appearances of some tuple that represent the very same proposition? In other words, in
accordance with the definition I gave in the introduction to this chapter, doesn't the database contain some
redundancy?
 
Search WWH ::




Custom Search