that practicing data science is inherently a collective endeavor. In the
beginning of the course, Rachel showed us a hub-and-spoke network
diagram. She had brought us all together and so was at the center. The
spokes connected each of us to her. It became her hope that new
friendships/ideas/projects/connections would form during the
course.
It's perhaps more important in an emergent field than in any other to
be part of a community. For data science in particular, it's not just useful
to your career—it's essential to your practice. If you don't read the
blogs, or follow people on Twitter, or attend meetups, how can you
find out about the latest distributed computing software, or a refutation
of the statistical approach of a prominent article? The community
is so tight-knit that when Cathy was speaking about MapReduce at a
meetup in April, she was able to refer a question to an audience member,
Nick Avteniev. Easy and immediate reference to the experts of
the field is the norm. Data science's body of knowledge is changing
and distributed, to the extent that the only way of finding out what
you should know is by looking at what other people know. Having a
bunch of different lecturers kickstarted this process for us. All of them
answered our questions. All gave us their email addresses. Some even
gave us jobs.
Having listened to and conversed with these experts, we formed more
questions. How can we create a time series object in R? Why do we
keep getting errors in our plotting of our confusion matrix? What the
heck is a random forest? Here, we not only looked to our fellow students
for answers, but also went to online communities such as
Stack Overflow, Google Groups, and R-bloggers. It turns out that there
is a rich support community out there for budding data scientists like
us trying to make our code run. And we weren't just getting answers
from others who had run into the same problems before us. No, these
questions were being answered by the pioneers of the methods. People
like Hadley Wickham, Wes McKinney, and Mike Bostock were providing
support for the packages they themselves wrote. Amazing.
Your Mileage May Vary
It's not as if there's some platonic repository of perfect data science
knowledge that you can absorb by osmosis. There are various good
practices from various disciplines, and different vocabularies and
interpretations for the same method (is the regularization parameter a