talking about statistical models, which is what much of this topic is
about. One of Andrew Gelman's blog posts on modeling was recently
tweeted by people in the fashion industry, but that's a different issue.
Even if you've used the terms statistical model or mathematical model
for years, is it even clear to you and to the people you're talking to
what you mean? What makes a model a model? Also, while we're asking
fundamental questions like this, what's the difference between a
statistical model and a machine learning algorithm?
Before we dive deeply into that, let's add a bit of context with this
deliberately provocative Wired magazine piece, “The End of Theory:
The Data Deluge Makes the Scientific Method Obsolete,” published in
2008 by Chris Anderson, then editor-in-chief.
Anderson equates massive amounts of data to complete information
and argues that no models are necessary and “correlation is enough”;
for example, he writes that in the context of massive amounts of data,
“they [Google] don't have to settle for models at all.”
Really? We don't think so, and we don't think you'll think so either by
the end of the topic. But the sentiment is similar to the Cukier and
Mayer-Schoenberger article we just discussed about N=ALL, so you
might already be getting a sense of the profound confusion we're
witnessing all around us.
To their credit, it's the press that's currently raising awareness of these
questions and issues, and someone has to do it. Even so, it's hard to
take when the opinion makers are people who don't actually work with
data. Think critically about whether you buy what Anderson is saying:
where you agree, where you disagree, and where you need more
information to form an opinion.
Given that this is how the popular press is currently describing and
influencing public perception of data science and modeling, it's
incumbent upon us as data scientists to be aware of it and to chime in
with informed comments.
With that context, then, what do we mean when we say models? And
how do we use them as data scientists? To get at these questions, let's
dive in.
What is a model?
Humans try to understand the world around them by representing it
in different ways. Architects capture attributes of buildings through