Gutierrez: What were you most proud of for this project?
Tunkelang: I'm most proud of the fact that although there were only three
of us doing most of the work for this project, the changes we made improved
the quality of a huge fraction of web search queries. It may sound cliché,
but it was great to work on something that benefits my mom's day-to-day
experience online.
Gutierrez: How did the three of you come together to work on this
project—was it one of Google's 20% projects?
Tunkelang: This was our day job and not a 20% project. We were assigned to
work as a team, and so we took complete ownership of the project. The previous developers were available for us to consult, but mostly they were happy
to work on new projects while we improved on theirs. I truly have the utmost
respect for developers who understand that their products outlive them.
Gutierrez: How was the model for improving local business search built?
Tunkelang: Fortunately, we had a framework in place to compare the performance of different machine learning approaches. We tried a bunch of them, evaluating their accuracy against a golden set, as well as their efficiency, stability, and interpretability. Ultimately, we opted to use decision trees.
We'd expected that switching from regression to decision trees would trade
off accuracy for interpretability and stability. But, to our pleasant surprise, we
were able to improve all three. And it was a lot easier to work on new model
features once we had a decision tree model in place.
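The comparison framework Tunkelang describes is internal to Google, but the workflow generalizes: train several model families on the same features and score each against a held-out, human-rated golden set. Here is a minimal sketch using scikit-learn and synthetic data; the actual features, labels, and model settings are not public, so everything below is invented for illustration:

```python
# Sketch of comparing a regression model against a decision tree on a
# held-out "golden set" (illustrative only; not the Google framework).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier, export_text

# Stand-in for query/business features with human-rated labels.
X, y = make_classification(n_samples=5000, n_features=10, random_state=0)
X_train, X_golden, y_train, y_golden = train_test_split(
    X, y, test_size=0.2, random_state=0)  # hold out a "golden set"

models = {
    "regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(max_depth=5, random_state=0),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    print(name, "golden-set accuracy:", model.score(X_golden, y_golden))

# Interpretability: the tree can be dumped as human-readable if/else rules.
print(export_text(models["decision tree"]))
```

Golden-set accuracy stands in for the team's quality metric here; efficiency and stability would need their own measurements, for example latency benchmarks and the variance of accuracy across retrainings.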
Gutierrez: How are decisions made about replacing models that are already
in production?
Tunkelang: At both Google and LinkedIn, we make decisions based on
metrics. If a change affects only one metric—or if it affects several metrics
but all in the same direction—then the decision process is clear: we ship it.
The more interesting case is where some metrics go up and others go down.
In theory, we use a single utility measure to assess overall impact. In practice,
we negotiate whether the tradeoff is net positive to the business. The decisions usually happen at the level of the teams that own the various metrics,
but in exceptional cases the tradeoffs get escalated to someone who can
arbitrate between competing business goals.
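The interview doesn't describe a concrete utility function, but the idea of collapsing mixed metric movements into a single number can be sketched as a weighted sum of per-metric deltas, where the weights (invented below) encode how the business values each metric:

```python
# Toy sketch of a "single utility measure" over A/B-test metric deltas.
# The metrics, deltas, and weights are all hypothetical.
metric_deltas = {"clicks": +0.8, "latency_ms": +3.0, "revenue": -0.1}
weights = {"clicks": 1.0, "latency_ms": -0.05, "revenue": 5.0}

utility = sum(weights[m] * d for m, d in metric_deltas.items())
print("net utility:", utility, "-> ship" if utility > 0 else "-> don't ship")
```

In practice, as Tunkelang notes, the weights themselves are what gets negotiated, which is why some tradeoffs have to be escalated.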
Gutierrez: What lessons did you learn from this project?
Tunkelang: I've always valued interpretability, but this project showed me how
crucial it could be in the context of machine learning. I also learned a lot about
the challenges of working with unrepresentative training data. While we had
large volumes of training data, we also had systematic biases that could trick our machine learning models into overfitting to those biases. We had to learn to
compensate for those biases and to distrust anything that looked too clever.
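The interview doesn't say which technique the team used to compensate, but one standard approach is importance weighting: down-weight over-represented slices of the training data so the training distribution matches the traffic you actually care about. A hypothetical sketch, with made-up group labels and proportions:

```python
# Reweight training examples so over-represented groups are down-weighted
# and under-represented groups are up-weighted (illustrative technique).
from collections import Counter

def importance_weights(train_groups, target_proportions):
    """Weight each example by target_proportion / train_proportion
    for its group."""
    counts = Counter(train_groups)
    n = len(train_groups)
    return [target_proportions[g] / (counts[g] / n) for g in train_groups]

# Toy example: 90% of logged queries are "head" queries, but we want a
# model that also serves the 50% of real traffic that is "tail".
groups = ["head"] * 90 + ["tail"] * 10
weights = importance_weights(groups, {"head": 0.5, "tail": 0.5})
print(weights[0], weights[-1])  # head examples ~0.56, tail examples 5.0
```

Most scikit-learn estimators accept such weights through the sample_weight argument of fit, so the same training pipeline can be reused with the bias correction applied.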
 