Denormalization - Database Design and Relational Theory

Databases Reference

In-Depth Information

worry about normalization and denormalization, perhaps). Either prove the correctness of this claim or show that

it's incorrect by producing a counterexample.

8.2 The following is an excerpt from a published interview with a database consultant. 16 It begins with a

statement from the consultant:

Consultant: The problems ... largely result from normalizing data across multiple [relvars] ... Many queries, however,

are much easier to understand if the data is denormalized ...

Interviewer: Doesn't denormalization potentially lower data integrity and reduce flexibility in supporting unanticipated

queries?

Consultant: Normalization, and its emphasis on elimination of redundant storage, is purely a transaction processing

issue. When users view data, they see it in a redundant form. In order to transform data into a form that is useful to

users, it must be denormalized by means of a join, which is essentially a way of dynamically denormalizing data for

greater ease of use.

The problem is that users can't tolerate the time and cost of joins. To address the problem, companies replicate

data in an ever increasing number of decision support databases, which represent denormalized views of the data.

What in your opinion is wrong (if anything) with the opinions expressed?

8.3 The possibility of using surrogate identifiers or keys was mentioned in the body of the chapter. Indeed, many

designers recommend the use of such artifical or surrogate keys in place of what are sometimes called “natural”

keys. For example, we might add an attribute SPNO, say, to our usual shipments relvar (making sure it has the

uniqueness property, of course) and then make {SPNO} a surrogate key for that relvar. (Note, however, that

{SNO,PNO} would still be a key; it just wouldn't be the only one any longer.) Thus, surrogate keys are keys in the

usual relational sense, but (a) they always involve exactly one attribute and (b) their values serve solely as surrogates

for the entities they stand for (i.e., they serve merely to represent the fact that those entities exist—they carry

absolutely no additional meaning or baggage of any kind). Ideally, those surrogate values would be system

generated, but whether they're system or user generated has nothing to do with the basic idea of surrogate keys as

such. Two questions: Are surrogate keys the same thing as tuple IDs? And do you think they're a good idea?

8.4 If two designs are information equivalent, it must be possible to use operations of the relational algebra to

convert them into one another. Consider, therefore, the following competing designs for an employees relvar from

the body of the chapter:

EMP { ENO , JAN_PAY , FEB_PAY , ..., DEC_PAY }

KEY { ENO }

EMP { ENO , MONTH , PAY }

KEY { ENO , MONTH }

Using Tutorial D or SQL (or your own preferred database language), show how each of these designs can be

converted into the other.

16 It's from Data Base Newsletter 22 , No. 5 (September/October 1994).

Search WWH ::

Custom Search

Home