Database Redesign - Database Processing: Fundamentals, Design, and Implementation

Database Reference

In-Depth Information

Summary

Database design and implementation is needed for three

reasons. Databases can be created (1) from existing data

(such as spreadsheets and databases tables), (2) for a new

systems development project, or (3) for a database redesign.

Database redesign is part of the system maintenance step of

the SDLC, and is necessary both to fix mistakes made during

the initial database design and also to adapt the database to

changes in system requirements. Such changes are common

because information systems and organizations do not just

influence each other—they create each other. Thus, new in-

formation systems cause changes in systems requirements.

Correlated subqueries and SQL EXISTS and NOT EXISTS

keywords are important tools. They can be used to answer

advanced queries. They also are useful during database rede-

sign for determining whether specified data conditions exist.

For example, they can be used to determine whether possible

functional dependencies exist in the data.

A correlated subquery appears deceptively similar to a

regular subquery. The difference is that a regular subquery

can be processed from the bottom up. In a regular subquery,

results from the lowest query can be determined and then

used to evaluate the upper-level queries. In contrast, in a

correlated subquery, the processing is nested; that is, a row

from an upper-level query statement is compared with rows

in a lower-level query. The key distinction of a correlated

subquery is that the lower-level SELECT statements use col-

umns from upper-level statements.

The SQL EXISTS and NOT EXISTS keywords create spe-

cialized forms of correlated subqueries. When these are used,

the upper-level query produces results, depending on the ex-

istence or nonexistence of rows in lower-level queries. An

EXISTS condition is true if any row in the subquery meets the

specified conditions; a NOT EXISTS condition is true only if all

rows in the subquery do not meet the specified condition. NOT

EXISTS is useful for queries that involve conditions that must

be true for all rows, such as a “customer who has purchased

all products.” The double use of NOT EXISTS is a famous SQL

pattern that often is used to test a person's knowledge of SQL.

Before redesigning a database, the existing database

needs to be carefully examined to avoid making the data-

base unusable by partially processing a database change.

The rule is to measure twice and cut once. Reverse engineer-

ing is used to create a data model of the existing database.

This is done to better understand the database structure

before proceeding with a change. The data model produced,

called a reverse engineered (RE) data model, is not a true

data model, but is a thing unto itself. Most data modeling

tools can perform reverse engineering. The RE data model

almost always has missing information; such models should

be carefully reviewed.

All of the elements of a database are interrelated.

Dependency graphs are used to portray the dependency of

one element on another. For example, a change in a table can

potentially affect relationships, views, indexes, triggers, stored

procedures, and application programs. These impacts need to

be known and accounted for before making database changes.

A complete backup must be made to the operational da-

tabase prior to any database redesign changes. Additionally,

such changes must be thoroughly tested, initially on small

test databases and later on larger test databases that may

even be duplicates of the operational databases. The rede-

sign changes are made only after such extensive testing has

been completed.

Database redesign changes can be grouped into differ-

ent types. One type involves changing table names and table

columns. Changing a table name has a surprising number of

potential consequences. A dependency graph should be used

to understand these consequences before proceeding with the

change. Nonkey columns are readily added and deleted. Adding

a NOT NULL column must be done in three steps: first, add the

column as NULL; then add data to every row; and then alter the

column constraint to NOT NULL. To drop a column used as a

foreign key, the foreign key constraint must first be dropped.

Column data types and constraints can be changed

using the ALTER TABLE ALTER COLUMN statement.

Changing the data type to Char or Varchar from a more spe-

cific type, such as Date, is usually not a problem. Changing a

data type from Char or Varchar to a more specific type can

be a problem. In some cases, data will be lost or the DBMS

may refuse the change.

Constraints can be added or dropped using the ADD

CONSTRAINT and DROP CONSTRAINT with the SQL

ALTER TABLE statement. Use of this statement is easier if the

developers have provided their own names for all constraints.

Changing minimum cardinalities on the parent side of

a relationship is simply a matter of altering the constraint

on the foreign key from NULL to NOT NULL or from NOT

NULL to NULL. Changing minimum cardinalities on the

child side of a relationship can be accomplished only by add-

ing or dropping triggers that enforce the constraint.

Changing maximum cardinality from 1:1 to 1:N is simple

if the foreign key resides in the correct table. In that case, just

remove the unique constraint on the foreign key column. If

the foreign key resides in the wrong table for this change,

move the foreign key to the other table and do not place a

unique constraint on that table.

Changing a 1:N relationship to an N:M relationship

requires building a new intersection table and moving the

primary key and foreign key values to the intersection table.

This aspect of the change is relatively simple. It is more

Database Processing: Fundamentals, Design, and Implementation

Search WWH ::

Custom Search

Home