Here at the data science team at The New York Times, I'm building a group,
and I assure you that I spend as much time thinking hard about the place and
people as I do on things and ideas. Similarly, hackNY is all about mentoring. The
whole point of hackNY is to create a network of very talented young people
who believe in themselves and believe in each other and bring out the best in
themselves and bring out the best in each other. And certainly at Columbia, the
reason I'm still in academia is that I really value the teaching and mentoring and
the quest to better yourself and better your community that you get from an
in-person brick-and-mortar university as opposed to a MOOC.
Gutierrez: What does a typical day at work look like for you?
Wiggins: There are very few typical days right now, though I look forward to
having one in the future. I try to make my days at The New York Times typical
because this is a company. What I mean by that is that it is a place of
interdependent people, and so people rely on you. So I try throughout the day to make
sure I meet with everyone in my group in the morning, meet with everyone in
my group in the afternoon, and meet with stakeholders who have either data
issues or who I think have data issues but don't know it yet. Really, at this point, I
would say that at none of my three jobs is there such a thing as a “typical day.”
Gutierrez: Where do you get ideas for things to study or analyze?
Wiggins: Over the past 20 years, I would say the main driver of my ideas has
been seeing people doing it “wrong.” That is, I see people I respect working on
problems that I think are important, and I think they're not answering those
questions the right way. This is particularly true in my early career in machine
learning applied to biology, where I was looking at papers written by statistical
physicists whom I respected greatly, but I didn't think that they were using, or
let's say stealing, the appropriate tools for answering the questions they had.
And to me, in the same way that Einstein stole Riemannian geometry from
Riemann and showed that it was the right tool for differential geometry, there
are many problems of interest to theoretical physicists where the right tools
are coming from applied computational statistics, and so they should use those
tools. So a lot of my ideas come from paying attention to communities that
I value, and not being able to brush it off when I see people whom I respect
who I think are not answering a question the right way.
Gutierrez: What specific tools or techniques do you use?
Wiggins: My group here at The New York Times uses only open source
statistical software, so everything is either in R or Python, leaning heavily on
scikit-learn and occasionally IPython notebooks. We rely heavily on Git as
version control. I mostly tend to favor methods of supervised learning rather
than unsupervised learning, because usually when I do an act of clustering,
which is generically what one does as unsupervised learning, I never know if
I've done it the best. I always worry that there is some other clustering that I
could do, and I won't even know which of the two clusterings is the better.
 