Databases Reference
In-Depth Information
• Default or dummy dates have been established in many different ways,
some of which can be quite confusing:
— Is 99365 (Dec 31, 1999) a dummy value, or a legitimate date?
— How about 010101 (Jan 1, 2001)? 999999?
— If your programs run across 000000 in the Year 2000, what action
will they take? What action should they take?
• The Year 2000 itself will cause many of the existing dates and formats
to become unacceptable, whereas up to now they may have worked
just fine.
— Two-digit years will need to be accompanied by two-digit century
values, or
— Some interim solution based (for example) on sliding date windows
will need to be adopted, or
— A business decision to accept the risks and potential consequences
of inaccurate processing will need to be made by the enterprise.
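
The sliding-window option and the dummy-date questions above can be sketched in Python as follows. This is only a minimal illustration, not a prescribed method: the names expand_year and parse_yymmdd, the 20-year look-ahead, and the set of dummy values are all assumptions made for the example. Which values really are dummies has to be established from your own files.

```python
from datetime import date
from typing import Optional

# Hypothetical dummy values; the real list must come from an audit of your data.
DUMMY_YYMMDD = {"000000", "999999"}


def expand_year(yy: int, current_year: int, years_ahead: int = 20) -> int:
    """Place a two-digit year in the 100-year window ending at
    current_year + years_ahead (the look-ahead is an assumed policy)."""
    if not 0 <= yy <= 99:
        raise ValueError(f"two-digit year expected, got {yy}")
    window_end = current_year + years_ahead
    century = window_end - window_end % 100
    year = century + yy
    # Anything past the end of the window belongs to the previous century.
    if year > window_end:
        year -= 100
    return year


def parse_yymmdd(text: str, current_year: int) -> Optional[date]:
    """Return None for a known dummy value, otherwise a full date.
    Raises ValueError for values that are neither dummies nor valid dates."""
    if text in DUMMY_YYMMDD:
        return None
    yy, mm, dd = int(text[0:2]), int(text[2:4]), int(text[4:6])
    return date(expand_year(yy, current_year), mm, dd)
```

With a current year of 2000, the year 99 expands to 1999 and 01 to 2001, while 000000 is reported as a dummy instead of being silently turned into a date. Because the window slides with the current year, the same two-digit value can map to a different century in later years, which is why the text calls this an interim solution.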
Special Case: Addresses
Addresses can be complicated to work with because they routinely combine
many of the issues we've described into just a few fields, and the
resulting combination is particularly challenging to unravel or to
correct. These problems can include:
• No common customer key across records and files
• Multiple names within one field
• One name across two fields
• Name and address in same field
• Personal and commercial names mixed
• Different addresses for the same customer
• Different names and spellings for same customer
• “Noisy” name and address domains
• Inconsistent use of special characters
• Multiple formats within disparate data files
• Legacy data buried and floating within free-form fields
• Multiple account numbers blocking a consolidated view
• Complex matching and consolidation
• Data values that stray from their descriptions and business rules.
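
Several of these problems (different spellings for the same customer, inconsistent special characters, multiple formats) are commonly approached by reducing each name or address to a normalized comparison key before matching. The sketch below shows one simple way this might be done. The function name match_key and the specific rules (uppercase, strip accents, treat punctuation as spaces) are assumptions for illustration only, and real matching usually needs considerably more than this.

```python
import re
import unicodedata


def match_key(text: str) -> str:
    """Reduce a name or address string to a crude comparison key."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Uppercase, and turn every run of non-alphanumeric characters into a space.
    cleaned = re.sub(r"[^A-Z0-9]+", " ", stripped.upper())
    # Collapse repeated whitespace and trim the ends.
    return " ".join(cleaned.split())


records = ["Smith,  John A.", "SMITH JOHN A", "smith-john a"]
keys = {match_key(r) for r in records}
```

Here all three records collapse to the single key SMITH JOHN A, so they can be grouped as candidates for the same customer. A key like this only removes formatting noise; it does not resolve multiple names in one field, reordered name parts, or genuinely different addresses for the same customer.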
CONCLUSION
Knowing your enemy is the first step towards defeating him. In this case
the enemy is “bad” data, and this paper has illustrated several ways in
which poor-quality data can be present in your files. The task for you, the
reader, is to use the examples provided and apply them to your own specific
systems and databases. Assess the severity of the situation, evaluate the
cost-effectiveness or desirability of taking corrective action, and proceed
accordingly.