Where Does It Come From?
“Where are you from?” is one of the first questions we ask when meeting a
new person. Historically, people were named after their birthplace (Joseph of
Arimathea, Robin of Loxley, and so on) or by their profession, which went a
long way toward understanding their background.
Data is no different. Understanding where data comes from means understanding
how the data was collected and how it was processed before it came into
your hands. It also means exploring the goals and motivations of the data
product author. Try asking the following questions:
1. What real-life behavior does it reflect? Data is a lens, a perspective on
real life that is grounded in something tangible. Consider, for instance, the
fantasy football example mentioned in the first chapter. Fantasy football scoring
does not always directly correlate with actual game scoring, but it does
tie to common events on the field. The more your players are involved in the
action—catching passes, gaining yards—the more you can expect to score.
2. What are the strengths or weaknesses of the data sources? In examining
any data product, a good place to start is to consider the source. Who
recorded the information? Consider the motivation behind those providing
the data product. Does the source have a financial incentive to provide accurate
information?
A client of ours, for instance, discovered that the office locations of a large
percentage of its national customers had been entered as the client's own
headquarters. When the company investigated, it found that those entering the
data had simply used the first address that came to mind. There was no
incentive to enter the data accurately. The
Data Journalism Handbook notes:
The easiest way to show off with spectacular data is to fabricate it. It sounds
obvious, but data as commonly commented upon as GDP figures can very
well be phony. Former British ambassador Craig Murray reports in his book,
Murder in Samarkand, that growth rates in Uzbekistan are subject to intense
negotiations between the local government and international bodies. In other
words, it has nothing to do with the local economy. 4
3. What information is emphasized? Though data may be objective, subjective
people author data products. There is often a message that has been
baked into the data presentation. The choice of metrics often indicates what is
important to the author, and the visual decisions can emphasize one conclusion