Gutierrez: How do you choose which projects to propose and/or join?
Hu: This is a question that I think about a lot, and I think any data scientist
wrestles with this problem. My theory on this is it is all about producing
something that your audience or customer will find useful or actually needs.
This means, in general, that I place much less focus on theoretical techniques.
I think that is just a necessary component of working at a startup or a
fast-paced environment. So in this role, I think mostly about what our customer or
our theoretical future customer wants and needs most immediately.
This involves working really closely with the product team and the customer
team to figure out which insights they actually care about. We
track all our products on Trello. Our team puts together an overview of all
the data science questions that have come from customers over the years.
We are adding new questions to the board all of the time. The key to what
projects to suggest or choose is the prioritization of these questions. We are
always trying to surface the projects that our customers care about at that
moment in time.
Of course, something that we are always wrestling with is what is interesting
theoretically versus what is useful but more mundane. There are a lot of
theoretical things that I want to work on and that I find really fundamentally
interesting, but when our product or customer teams hear about it, it's like,
“Oh, God—why are you spending time on this? That's stupid.” Once we have a
conversation about it, I take that information and blend interesting theoretical
things with the needs of a customer.
Gutierrez: Take me through a recent project you worked on and what
insights you discovered.
Hu: One of the most interesting and impactful projects that I worked on in
the last year was essentially the prediction of up-and-coming artists. This is
something that we have wanted to do since I started, basically since the beginning
of the company. What really made it possible was the shaping of the data
and figuring out what we actually had. Of course, there had been all of the
projects that came before it, where we had tried different things. What was
different about this project was the aha! moment. The aha! moment came
when we realized that what we had been lacking in previous projects was
choosing the correct success metric.
In music, there are many ways you can define success, many different stages
in an artist's career, and many different milestones that you might think of as
important. But in order to really build a model and to make predictions, you
have to define what your output metric is that you care about. So one day we
were having discussions about how we would do this, and the success metric
that we ultimately came up with was the Billboard 200. If we define what we
care about in this particular question as the artists who are making hit albums
as defined by reaching the Billboard 200, it gives us a very useful metric.
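
As an illustration only, the success metric described above could be turned into a binary label for a model roughly as follows. This is a minimal sketch, not the method used in the project: the ArtistRecord type, its best_billboard_200_rank field, and the function names are all hypothetical, and it assumes that each artist's best chart position (or None if the artist never charted) is already known.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ArtistRecord:
    """Hypothetical record; the field names are assumptions for illustration."""
    name: str
    # Best (lowest) position an album by this artist reached on the
    # Billboard 200, or None if no album ever charted.
    best_billboard_200_rank: Optional[int]


def reached_billboard_200(record: ArtistRecord) -> bool:
    """Success metric: the artist has an album that charted at positions 1-200."""
    rank = record.best_billboard_200_rank
    return rank is not None and 1 <= rank <= 200


def label_artists(records: List[ArtistRecord]) -> Dict[str, int]:
    """Map each artist name to a 0/1 target that a model could be trained on."""
    return {r.name: int(reached_billboard_200(r)) for r in records}
```

The point of the sketch is that once the output is a single yes/no value per artist, the prediction question becomes well defined, whatever model is then used.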
 