Databases Reference
In-Depth Information
collapsed into variable-length and self-describing attributes. When they
leave the organization, their work can be your worst nightmare. It becomes
extremely difficult to clean up after their “legacy.”
THE DATA MIGRATION/TRANSFORMATION PROCESS
Preparing flat-plane databases for XML tagging requires a three-step
process. No step can be omitted, or the conversion is destined to create
more problems than it solves.
1. Analyze current data
2. Clean up current data
3. Transform the data
Analyze Current Data
Once you've decided to migrate your data from one structure to another,
the first step is to thoroughly analyze your existing data. This process
should especially focus on domain analysis, since it will help set the stage
for data type identification.
If you've already analyzed your data during the Y2K compliance effort, the
results of your analysis can be used again for the data transformation effort.
This process should have already included the thorough testing of the output.
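As a rough illustration of domain analysis, the sketch below tallies the values found in a single field and separates those that match the expected format from those that do not. It is a minimal Python sketch, not taken from any particular tool: the function name profile_field, the sample values, and the YYMM pattern are all assumptions made for the example.

```python
from collections import Counter
import re

def profile_field(values, pattern):
    """Summarize the domain of one field: which values match the expected
    format and which do not (candidates for cleanup or embedded rules)."""
    regex = re.compile(pattern)
    conforming = Counter()
    exceptions = Counter()
    for raw in values:
        value = raw.strip()
        if regex.fullmatch(value):
            conforming[value] += 1
        else:
            exceptions[value] += 1
    return conforming, exceptions

# Hypothetical sample of a four-character YYMM date field.
sample = ["9807", "9811", "XXXX", "9913", "9807", "    ", "9999"]

# Assumed format: two-digit year followed by a month from 01 to 12.
conforming, exceptions = profile_field(sample, r"\d{2}(0[1-9]|1[0-2])")

print("conforming:", dict(conforming))
print("exceptions:", dict(exceptions))
```

The exceptions tally is the useful part: it lists every value that falls outside the expected domain, whether it is plain bad data (such as month 13 or a blank field) or a deliberate marker that will need special handling during cleanup.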
Clean Up Current Data
The next step is to clean up any bad data revealed during analysis and
testing. In many cases, this involves straightforward corrections to field
values. What may sound easy at first is complicated by values that
intentionally don't fit the format. Some of these exceptional values carry specific
meaning and are commonly referred to as embedded rules. An example
might be XXXX or 9999 to indicate “date unknown” in a field using YYMM
format. You may wish to preserve these special rules or replace them with
new ones. This can be done with such tools as Data Commander, which
analyzes, cleanses, and transforms pre-relational legacy mainframe data
and is available from Blackstone & Cullen of Atlanta, Georgia. Its
EXCEPTION statements allow you to exempt specified fields from the general
conversion (or migration or transformation) process or to convert them in
a manner different from the rest of the data fields. Exhibit 1 illustrates some
of the processes that are a part of the entire migration effort.
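The idea of treating embedded rules separately from ordinary bad data can be sketched in a few lines of Python. This is not Data Commander syntax. The table EMBEDDED_RULES, the function convert_yymm, the choice to map "date unknown" to None, and the fixed 1900 century are all assumptions made for illustration.

```python
from datetime import date

# Embedded rules: values that deliberately break the YYMM format.
# Mapping them to None ("date unknown") is an assumption for this sketch.
EMBEDDED_RULES = {"XXXX": None, "9999": None}

def convert_yymm(value, century=1900):
    """Convert a YYMM string to the first day of that month.

    Returns the mapped replacement for embedded-rule values and raises
    ValueError for values that are simply bad and need manual cleanup.
    The fixed century is an assumption; real data may need a date window.
    """
    value = value.strip()
    if value in EMBEDDED_RULES:
        return EMBEDDED_RULES[value]
    if len(value) != 4 or not value.isdecimal():
        raise ValueError(f"malformed YYMM value: {value!r}")
    year = century + int(value[:2])
    month = int(value[2:])
    return date(year, month, 1)  # raises ValueError if month is not 1-12
```

With these assumptions, convert_yymm("9807") returns date(1998, 7, 1), convert_yymm("XXXX") returns None, and convert_yymm("9913") raises ValueError. Keeping the exceptions in a lookup table makes it easy either to preserve the special values or to replace them with new ones.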
The actual preparation and migration of legacy data from a pre-relational
mainframe environment to a clean, consistent relational data store
happens in two major places: the host (or mainframe) location, where the
main “untangling” takes place and the data is cleaned, reformatted, and
scripted; and the server, where the web client resides. The data untangling
is best done on the host, as moving the data en masse to a server first must