Database Reference
In-Depth Information
Let's assume that the company you work for stores all of its information in
spreadsheets. In a spreadsheet world, your data is only as good as the peo-
ple who record it.
What does that mean for data consistency? Suppose you store all your
customer information in a single workbook in your spreadsheet. You want
to know a few pieces of basic information about each customer: name, ad-
dress, phone number, and e-mail address. That seems easy enough, but
now let's introduce the human element into the scenario. Your customer
service employees are required to add information to the workbook for
each new customer they work with. Because your customer service reps
are human, how they record the information will vary from person to per-
son. For example, a rep may record the customer's information as shown
in row 1 of Table 1.1, and another may record the same customer's infor-
mation a different way, as shown in row 2 of Table 1.1.
Table 1.1 The Same Customer's Information as Entered
by Two Customer Service Reps
Name
Address
City
State
ZIP
Phone
Email
John Doe
123 Easy Street
SF
CA
94134
(415) 555-1956
jdoe@abcnetwork.com
J. Doe
123 Easy St.
San Fran
CA
94134
5551956
jdoe@abcnetwork.com
These are subtle differences to be sure, but if you look closely you'll see
some problems. First, if you want to run a report to count all of your San
Francisco-based customers, how would you go about it? Sure, a human can
tell that “SF” and “San Fran” are shorthand for San Francisco, but a com-
puter can't make that assumption without help. To run your report, you
would need to look for all the possible ways that someone could key in San
Francisco, to include all the ways it can be misspelled. Next, let's look at
the customer's name. For starters, are we sure it's the same person? “J.
Doe” could be Jane Doe or Javier Doe. Although the e-mail address is the
same on both records, I have seen my fair share of families with only one
shared e-mail address. Additionally, the second customer service repre-
sentative omitted the customer's area code, and that means you must
spend time looking it up if you ever need to call the customer.
For data to be useful, it must be consistent; I cannot stress this enough.
This means that when you store a piece of data, it is stored in the same way
each and every time. The city is always stored as San Francisco, and the
Search WWH ::




Custom Search