Database Reference
In-Depth Information
Summary
Database design and implementation is needed for three
reasons. Databases can be created (1) from existing data
(such as spreadsheets and databases tables), (2) for a new
systems development project, or (3) for a database redesign.
Database redesign is part of the system maintenance step of
the SDLC, and is necessary both to fix mistakes made during
the initial database design and also to adapt the database to
changes in system requirements. Such changes are common
because information systems and organizations do not just
influence each other—they create each other. Thus, new in-
formation systems cause changes in systems requirements.
Correlated subqueries and SQL EXISTS and NOT EXISTS
keywords are important tools. They can be used to answer
advanced queries. They also are useful during database rede-
sign for determining whether specified data conditions exist.
For example, they can be used to determine whether possible
functional dependencies exist in the data.
A correlated subquery appears deceptively similar to a
regular subquery. The difference is that a regular subquery
can be processed from the bottom up. In a regular subquery,
results from the lowest query can be determined and then
used to evaluate the upper-level queries. In contrast, in a
correlated subquery, the processing is nested; that is, a row
from an upper-level query statement is compared with rows
in a lower-level query. The key distinction of a correlated
subquery is that the lower-level SELECT statements use col-
umns from upper-level statements.
The SQL EXISTS and NOT EXISTS keywords create spe-
cialized forms of correlated subqueries. When these are used,
the upper-level query produces results, depending on the ex-
istence or nonexistence of rows in lower-level queries. An
EXISTS condition is true if any row in the subquery meets the
specified conditions; a NOT EXISTS condition is true only if all
rows in the subquery do not meet the specified condition. NOT
EXISTS is useful for queries that involve conditions that must
be true for all rows, such as a “customer who has purchased
all products.” The double use of NOT EXISTS is a famous SQL
pattern that often is used to test a person's knowledge of SQL.
Before redesigning a database, the existing database
needs to be carefully examined to avoid making the data-
base unusable by partially processing a database change.
The rule is to measure twice and cut once. Reverse engineer-
ing is used to create a data model of the existing database.
This is done to better understand the database structure
before proceeding with a change. The data model produced,
called a reverse engineered (RE) data model, is not a true
data model, but is a thing unto itself. Most data modeling
tools can perform reverse engineering. The RE data model
almost always has missing information; such models should
be carefully reviewed.
All of the elements of a database are interrelated.
Dependency graphs are used to portray the dependency of
one element on another. For example, a change in a table can
potentially affect relationships, views, indexes, triggers, stored
procedures, and application programs. These impacts need to
be known and accounted for before making database changes.
A complete backup must be made to the operational da-
tabase prior to any database redesign changes. Additionally,
such changes must be thoroughly tested, initially on small
test databases and later on larger test databases that may
even be duplicates of the operational databases. The rede-
sign changes are made only after such extensive testing has
been completed.
Database redesign changes can be grouped into differ-
ent types. One type involves changing table names and table
columns. Changing a table name has a surprising number of
potential consequences. A dependency graph should be used
to understand these consequences before proceeding with the
change. Nonkey columns are readily added and deleted. Adding
a NOT NULL column must be done in three steps: first, add the
column as NULL; then add data to every row; and then alter the
column constraint to NOT NULL. To drop a column used as a
foreign key, the foreign key constraint must first be dropped.
Column data types and constraints can be changed
using the ALTER TABLE ALTER COLUMN statement.
Changing the data type to Char or Varchar from a more spe-
cific type, such as Date, is usually not a problem. Changing a
data type from Char or Varchar to a more specific type can
be a problem. In some cases, data will be lost or the DBMS
may refuse the change.
Constraints can be added or dropped using the ADD
CONSTRAINT and DROP CONSTRAINT with the SQL
ALTER TABLE statement. Use of this statement is easier if the
developers have provided their own names for all constraints.
Changing minimum cardinalities on the parent side of
a relationship is simply a matter of altering the constraint
on the foreign key from NULL to NOT NULL or from NOT
NULL to NULL. Changing minimum cardinalities on the
child side of a relationship can be accomplished only by add-
ing or dropping triggers that enforce the constraint.
Changing maximum cardinality from 1:1 to 1:N is simple
if the foreign key resides in the correct table. In that case, just
remove the unique constraint on the foreign key column. If
the foreign key resides in the wrong table for this change,
move the foreign key to the other table and do not place a
unique constraint on that table.
Changing a 1:N relationship to an N:M relationship
requires building a new intersection table and moving the
primary key and foreign key values to the intersection table.
This aspect of the change is relatively simple. It is more
 
 
Search WWH ::




Custom Search