smoker code.” That rule is not present in the application producing
File “A,” causing the discrepancy.
• Is there disagreement on Smith's age between the two files? No, it's
just that the dummy value for File “C” happens to be “99” rather than
“00.”
• Is File “C” missing more data for Frank than File “A”? Yes, but it's
intentional: the logic for File “C” refuses to store values for Age and
Smoker unless all mandatory fields (including Sex) are completed.
• Note that these types of inconsistency can appear within a single file
just as easily as between files, if the business rules/application logic
changes over time.
Exhibit 27-3. Example of different logic.

Data File “A”
Identifier  Last Name  Sex  Age  Smoker
523         Smith      M    00   N
524         Jones      F    23   Y
526         Lee        M    42
527         Frank           17   Y
528         Yu         M    00   N

Data File “C”
Identifier  Last Name  Sex  Age  Smoker
523         Smith      M    99   N
524         Jones      F    22   N
525         Samuelson  M    54   Y
526         Lee        F    42   Y
527         Frank
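The comparison of Files “A” and “C” discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows are transcribed from Exhibit 27-3, blanks are represented as None, and the names DUMMY_AGE, normalize, and find_conflicts are hypothetical helpers, not part of any application described in the text. The sketch treats each file's dummy age code as "unknown" so that only disagreements between two real values are reported.

```python
# Dummy age codes per file, as described in the text ("00" for File A, "99" for File C).
DUMMY_AGE = {"A": "00", "C": "99"}

# Rows transcribed from Exhibit 27-3; None marks a field left blank.
FILE_A = {
    523: {"last": "Smith", "sex": "M", "age": "00", "smoker": "N"},
    524: {"last": "Jones", "sex": "F", "age": "23", "smoker": "Y"},
    526: {"last": "Lee", "sex": "M", "age": "42", "smoker": None},
    527: {"last": "Frank", "sex": None, "age": "17", "smoker": "Y"},
    528: {"last": "Yu", "sex": "M", "age": "00", "smoker": "N"},
}
FILE_C = {
    523: {"last": "Smith", "sex": "M", "age": "99", "smoker": "N"},
    524: {"last": "Jones", "sex": "F", "age": "22", "smoker": "N"},
    525: {"last": "Samuelson", "sex": "M", "age": "54", "smoker": "Y"},
    526: {"last": "Lee", "sex": "F", "age": "42", "smoker": "Y"},
    527: {"last": "Frank", "sex": None, "age": None, "smoker": None},
}


def normalize(field, value, file_name):
    """Map blanks and the file's dummy age code to None ("unknown")."""
    if field == "age" and value == DUMMY_AGE[file_name]:
        return None
    return value


def find_conflicts(file_a, file_c):
    """Return {identifier: [(field, a_value, c_value), ...]} where both
    files hold a real value and the values differ."""
    conflicts = {}
    # Only identifiers present in both files can be compared.
    for ident in sorted(file_a.keys() & file_c.keys()):
        for field in ("sex", "age", "smoker"):
            a = normalize(field, file_a[ident][field], "A")
            c = normalize(field, file_c[ident][field], "C")
            if a is not None and c is not None and a != c:
                conflicts.setdefault(ident, []).append((field, a, c))
    return conflicts


if __name__ == "__main__":
    for ident, diffs in find_conflicts(FILE_A, FILE_C).items():
        print(ident, diffs)
```

Under these assumptions, Smith (523) and Frank (527) produce no conflicts, because their differences come from dummy codes and intentionally blank fields. The remaining differences are flagged for review; whether each one is a data error or the result of differing business rules still has to be decided by someone who knows the applications.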
Missing/Non-Unique Primary Key
In Exhibit 27-4, our hypothetical data warehouse draws information
from both Files “A” and “D”:
• Data File “D” does not have the “Identifier” field, and in fact does not
have a unique primary key. If we assume File “A” doesn't have the
First Name field, then will you match record 523 with Fred Smith or
with Sid Smith? Which File “D” record will you match record 528 with?
• At first glance, record 526 might seem an easy match. But can we really
assume 526's Lee is the same person as File “D”'s Tom Lee?
• The bottom line here is that the lack of a primary key can be a major
obstacle to accessing the data you need, even when the data is present
in the files themselves.
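The matching problem described above can be illustrated with a short Python sketch. It rests on several assumptions: the File “D” rows below are hypothetical and use only the names mentioned in the text (Fred Smith, Sid Smith, Tom Lee), since the full exhibit is not reproduced here, and match_by_last_name is an invented helper name. The sketch joins on Last Name, the only field the two files are assumed to share, and labels each result rather than silently picking a match.

```python
from collections import defaultdict

# File "A" records named in the text: (Identifier, Last Name).
FILE_A = [(523, "Smith"), (526, "Lee")]

# HYPOTHETICAL File "D" rows: (First Name, Last Name). File "D" has no
# Identifier field; only names mentioned in the text are used here.
FILE_D = [("Fred", "Smith"), ("Sid", "Smith"), ("Tom", "Lee")]


def match_by_last_name(file_a, file_d):
    """Return {identifier: (status, candidate_first_names)} for each
    File "A" record, joining on Last Name only."""
    by_last = defaultdict(list)
    for first, last in file_d:
        by_last[last].append(first)

    report = {}
    for ident, last in file_a:
        candidates = by_last.get(last, [])
        if not candidates:
            status = "no candidate"
        elif len(candidates) == 1:
            # A lone candidate is still not proof of identity.
            status = "single candidate, unverified"
        else:
            status = "ambiguous"
        report[ident] = (status, candidates)
    return report


if __name__ == "__main__":
    for ident, outcome in match_by_last_name(FILE_A, FILE_D).items():
        print(ident, outcome)
```

Record 523 comes back as ambiguous (Fred or Sid), and record 526 comes back with a single candidate that is deliberately marked unverified: sharing a last name does not establish that the two records describe the same person.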
 