better? Actually, who cares how well we perform on what is, essentially,
our training data?
This is not completely fair to Hastie and his coauthors. They would
probably argue that if students wanted to learn about data scraping
and organization, they should get a different book that covers those
topics—the difference in the problems shows the stark contrast in
approach that this class took from normal academic introductory
courses. The philosophy that was repeatedly pushed on us was that
understanding the statistical tools of data science without the context
of the larger decisions and processes surrounding them strips them of
much of their meaning. Also, you can't just be told that real data is
messy and a pain to deal with, or that in the real world no one is going
to tell you exactly which regression model to use. These issues, and
the intuition gained from working through them, can only be
understood through experience.
Bridging Tunnels
As fledgling data scientists, we're not—with all due respect to Michael
Driscoll—exactly civil engineers. We don't necessarily have a grand
vision for what we're doing; there aren't always blueprints. Data
scientists are adventurers: we know what we're questing for, and we have
some tools in the toolkit, maybe a map, and a couple of friends. When
we get to the castle, our princess might be elsewhere, but what matters
is that along the way we stomped a bunch of Goombas and ate a bunch
of mushrooms, and we're still spitting hot fire. If science is a series of
pipes, we're not plumbers. We're the freaking Mario Brothers.
Some of Our Work
The students improved on the original data science profile from back
in Chapter 1 (see Figure 15-3) and created an infographic on the growing
popularity of data science in universities (see Figure 15-4), based on
information available to them at the end of 2012.