guarantees redundancy in the data. This violation of entity integrity can
be quite serious.
To ensure entity integrity in the new system, the DCT will have to choose
which of the duplicate old records is to be accepted as the correct one and
moved into the new system. It is helpful for audit routines to report each
such choice. In addition, it will be necessary to devise a primary key for
the new system, because one may not be available in the old data.
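
The duplicate check and the audit report described above can be sketched
in Python as follows. This is a minimal illustration, not a prescribed
method: the field names (cust_no, updated), the sample records, and the
"most recently updated record wins" rule are all assumptions made for the
example. The actual selection rule is a decision for the DCT.

```python
from collections import defaultdict

def find_duplicate_keys(records, key_field):
    """Group records by key value; return only keys that occur more than once."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key_field]].append(rec)
    return {k: recs for k, recs in groups.items() if len(recs) > 1}

def resolve_duplicates(records, key_field, choose):
    """Keep one record per key value. Returns (kept_records, audit_log)."""
    duplicates = find_duplicate_keys(records, key_field)
    kept, audit, seen = [], [], set()
    for rec in records:
        key = rec[key_field]
        if key in seen:
            continue  # this key was already handled
        seen.add(key)
        if key in duplicates:
            winner = choose(duplicates[key])
            rejected = [r for r in duplicates[key] if r is not winner]
            # Record the decision so an audit routine can report it.
            audit.append({"key": key, "kept": winner, "rejected": rejected})
            kept.append(winner)
        else:
            kept.append(rec)
    return kept, audit

# Hypothetical old-system records; cust_no should be unique but is not.
old_customers = [
    {"cust_no": "100", "name": "Acme Ltd", "updated": "1998-03-01"},
    {"cust_no": "100", "name": "Acme Limited", "updated": "1999-07-15"},
    {"cust_no": "200", "name": "Baker & Co", "updated": "1997-11-30"},
]

# Assumed rule for illustration only: the most recently updated record wins.
kept, audit = resolve_duplicates(
    old_customers, "cust_no",
    choose=lambda recs: max(recs, key=lambda r: r["updated"]),
)
```

Passing the selection rule in as a function keeps the detection logic
separate from the business decision about which record is correct.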
Uniqueness
In many cases, there are other fields that also should be unique and
could serve as an alternate key. In some cases, even if the primary key is
unique, there are duplicate values in these alternate keys, which again
create a problem for integrity in the new system.
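
A uniqueness check on a candidate alternate key might look like the sketch
below. The field names (emp_id, tax_id) and the sample data are
hypothetical. The point is that the primary key can pass the check while
an alternate key fails it.

```python
from collections import Counter

def non_unique_values(records, fields):
    """Return the values of a candidate key (a tuple of fields) that occur more than once."""
    counts = Counter(tuple(rec[f] for f in fields) for rec in records)
    return {value: n for value, n in counts.items() if n > 1}

# Hypothetical data: emp_id is unique, but tax_id is repeated.
employees = [
    {"emp_id": 1, "tax_id": "111-22-3333", "name": "Lee"},
    {"emp_id": 2, "tax_id": "444-55-6666", "name": "Ortiz"},
    {"emp_id": 3, "tax_id": "111-22-3333", "name": "Lee"},
]

primary_key_clashes = non_unique_values(employees, ["emp_id"])
alternate_key_clashes = non_unique_values(employees, ["tax_id"])
```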
Referential Integrity
The DCT should determine whether the data correctly reflects referential
integrity constraints. In a relational system, tables are joined together
by primary key/foreign key links. The information to create this link may
not be available in the old data. If records from different files are to be
matched and joined, it should be determined whether the information
exists to do the join correctly (i.e., a unique primary key and a foreign key).
Again, this problem needs to be addressed prior to conversion.
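
As a rough sketch of such a pre-conversion check, the function below tests
whether the parent key is unique and whether every child record carries a
foreign key that matches a parent. The file contents and field names
(cust_no, order_no) are assumptions made for the example.

```python
def check_referential_integrity(parents, parent_key, children, foreign_key):
    """Report whether a parent/child join can be done correctly."""
    parent_keys = [p[parent_key] for p in parents]
    unique_parent_keys = set(parent_keys)
    missing, orphans = [], []
    for child in children:
        value = child.get(foreign_key)
        if value in (None, ""):
            missing.append(child)      # no foreign key value at all
        elif value not in unique_parent_keys:
            orphans.append(child)      # foreign key points at no parent
    return {
        "parent_key_is_unique": len(parent_keys) == len(unique_parent_keys),
        "missing_foreign_key": missing,
        "orphans": orphans,
    }

# Hypothetical old files: customers (parent) and orders (child).
customers = [{"cust_no": "100"}, {"cust_no": "200"}]
orders = [
    {"order_no": 1, "cust_no": "100"},
    {"order_no": 2, "cust_no": "300"},  # no such customer
    {"order_no": 3, "cust_no": ""},     # link information missing
]

report = check_referential_integrity(customers, "cust_no", orders, "cust_no")
```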
Domain Integrity
The domain for a field imposes constraints on the values that should be
found there. IS should determine if there are data domains that have been
coded into character or numeric fields in an undisciplined and inconsistent
fashion. It should further be determined whether there are numeric domains
that have been coded into character fields, perhaps with some nonnumeric
values. There may be date fields that are just text strings, and the
components of the date may appear in any order. A common problem is that
date or numeric fields stored as text may contain absurd values, or values
of the wrong data type entirely.
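
One way to screen text fields that are supposed to hold numbers or dates is
sketched below. The list of accepted date formats is an assumption for
illustration; the real list has to come from an inspection of the old data.
Note that parsing alone cannot tell whether 03/04/1999 means March 4 or
April 3, so ambiguous layouts still need a rule agreed with the data owners.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation

# Assumed formats for this example only.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")

def parse_number(text):
    """Return a Decimal if the text is a finite number, else None."""
    try:
        value = Decimal(text.strip())
    except InvalidOperation:
        return None
    return value if value.is_finite() else None

def parse_date(text):
    """Return a date if the text matches one of the assumed formats, else None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue  # wrong layout or impossible date; try the next format
    return None

# Hypothetical values found in character fields of the old system.
amounts = ["125.50", " 40", "N/A", "12O0", ""]
dates = ["1999-07-15", "07/15/1999", "19990715", "02/30/1999", "unknown"]

bad_amounts = [a for a in amounts if parse_number(a) is None]
bad_dates = [d for d in dates if parse_date(d) is None]
```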
Another determination that should be made is whether the domain-coding
rules have changed over time and, if so, whether the older values have been
recoded. It is common for coded fields to contain codes that are no longer
in use and often codes that never were in use. Also, numeric fields may
contain out-of-range values. Composite domains could cause problems when
the team tries to separate them for storage in multiple fields, because the
boundaries for each subitem may not be in fixed columns.
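
The checks in this paragraph can be illustrated with the sketch below. The
code lists, the recoding of the retired code, the value range, and the
layout of the composite part number are all hypothetical. A pattern match is
used for the composite field because its subitems are assumed not to sit in
fixed columns.

```python
import re

# Hypothetical code lists for a status field.
CURRENT_CODES = {"A", "I", "S"}
RETIRED_CODES = {"P": "I"}  # retired code -> assumed current equivalent

def classify_code(code):
    """Label a code as current, retired (needs recoding), or unknown (never valid)."""
    if code in CURRENT_CODES:
        return "current"
    if code in RETIRED_CODES:
        return "retired"
    return "unknown"

def out_of_range(values, low, high):
    """Return the values that fall outside the inclusive range [low, high]."""
    return [v for v in values if not (low <= v <= high)]

# Assumed composite layout: two-letter plant, digits, one-letter grade,
# with optional hyphens or spaces between the subitems.
PART_PATTERN = re.compile(
    r"^(?P<plant>[A-Z]{2})[- ]?(?P<serial>\d+)[- ]?(?P<grade>[A-Z])$"
)

def split_part_number(text):
    """Split a composite part number into its subitems, or return None."""
    match = PART_PATTERN.match(text.strip().upper())
    return match.groupdict() if match else None

code_report = {c: classify_code(c) for c in ["A", "P", "Z"]}
bad_ages = out_of_range([5, 130, -1, 42], 0, 120)
parts = [split_part_number(p) for p in ["NY-00412-B", "ny412b", "412"]]
```

Values that come back as "unknown" or None are candidates for manual review
rather than automatic correction.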
There may be domains that incorrectly model internal hierarchy. This is
common in old-style systems and makes data modeling difficult. There