Database Design Using Normalization - Database Processing: Fundamentals, Design, and Implementation

Database Reference

In-Depth Information

You can reduce the complexity of the SQL required to read these data and also reduce

DBMS processing by joining the tables once and storing the joined result as a single table.

The following SQL statement will join the three tables together and store them in a new table

named STUDENT_ACTIVITY_PAYMENT_DATA:

/* *** SQL-INSERT-CH04-03 *** */

INSERT INTO STUDENT_ACTIVITY_PAYMENT_DATA

SELECT STUDENT.StudentID, StudentName,

ACTIVITY.Activity, ActivityFee,

AmountPaid

FROM STUDENT, PAYMENT, ACTIVITY

WHERE STUDENT.StudentID = PAYMENT.StudentID

AND PAYMENT.Activity = ACTIVITY.Activity;

As shown in Figure 4-6, the STUDENT_ACTIVITY_PAYMENT_DATA table that results from this

join has the same data as the original STUDENT_ACTIVITY table as shown in Figure 3-24.

As you can see, denormalization is simple. Just join the data together and store the joined

result as a table. By doing this when you place the data into the read-only database, you save the ap-

plication programmers from having to code joins for each application, and you also save the DBMS

from having to perform joins and subqueries every time the users run a query or create a report.

Customized Duplicated Tables

Because there is no danger of data integrity problems in a read-only database and because the

cost of storage today is miniscule, read-only databases are often designed with many copies of

the same data, each copy customized for a particular application.

For example, suppose a company has a large PRODUCT table with the columns listed in

Figure 4-7. The columns in this table are used by different business processes. Some are used

for purchasing, some are used for sales analysis, some are used for displaying parts on a Web

site, some are used for marketing, and some are used for inventory control.

The values of some of these columns, such as those for the picture images, are large. If

the DBMS is required to read all of these data for every query, processing is likely to be slow.

Accordingly, the organization might create several customized versions of this table for use by

different applications. In an updatable database, so much duplicated data would risk severe

data integrity problems, but for a read-only database, there is no such risk.

Suppose for this example that the organization designs the following tables:

PRODUCT_PURCHASING ( SKU , SKU_Description, VendorNumber,

VendorName, VendorContact_1, VendorContact_2, VendorStreet, VendorCity, VendorState,

VendorZIP)

PRODUCT_USAGE ( SKU , SKU_Description, QuantitySoldPastYear,

QuantitySoldPastQuarter, QuantitySoldPastMonth)

PRODUCT_WEB ( SKU , DetailPicture, ThumbnailPicture,

MarketingShortDescription, MarketingLongDescription, PartColor)

PRODUCT_INVENTORY ( SKU , PartNumber, SKU_Description, UnitsCode,

BinNumber, ProductionKeyCode)

You can create these tables using the graphical design facilities of Access or another DBMS.

Once the tables are created, they can be filled using INSERT commands similar to those al-

ready discussed. The only tricks are to watch for duplicated data and to use DISTINCT where

necessary. See Review Question 4.10.

Common Design Problems

Although normalization and denormalization are the primary considerations when designing

databases from existing data, there are four additional practical problems to consider. These

are summarized in Figure 4-8.

Database Processing: Fundamentals, Design, and Implementation

Search WWH ::

Custom Search

Home