• Even so, the gap doesn't represent simply a difference between
industry statistics and academic statistics. The general experience
of data scientists is that, at their job, they have access to a larger
body of knowledge and methodology, as well as a process, which
we now define as the data science process (details in Chapter 2),
that has foundations in both statistics and computer science.
Around all the hype, in other words, there is a ring of truth: this is
something new. But at the same time, it's a fragile, nascent idea at real
risk of being rejected prematurely. For one thing, it's being paraded
around as a magic bullet, raising unrealistic expectations that will
surely be disappointed.
Rachel gave herself the task of understanding the cultural phenomenon
of data science and how others were experiencing it. She started
meeting with people at Google, at startups and tech companies, and
at universities, mostly from within statistics departments.
From those meetings she started to form a clearer picture of the new
thing that's emerging. She ultimately decided to continue the investigation
by giving a course at Columbia called “Introduction to Data
Science,” which Cathy covered on her blog. We figured that by the end
of the semester, we, and hopefully the students, would know what all
this actually meant. And now, with this book, we hope to do the same
for many more people.
Why Now?
We have massive amounts of data about many aspects of our lives, and,
simultaneously, an abundance of inexpensive computing power.
Shopping, communicating, reading news, listening to music, searching
for information, expressing our opinions—all this is being tracked
online, as most people know.
What people might not know is that the “datafication” of our offline
behavior has started as well, mirroring the online data collection revolution
(more on this later). Put the two together, and there's a lot to
learn about our behavior and, by extension, who we are as a species.
It's not just Internet data, though—it's finance, the medical industry,
pharmaceuticals, bioinformatics, social welfare, government, education,
retail, and the list goes on. There is a growing influence of data
in most sectors and most industries. In some cases, the amount of data