Perlich: When we interviewed Melinda, we looked at a project we had
recently worked on. We said, “We have to optimize Nielsen reports.” Nielsen
is one of the companies that provide feedback on advertising campaigns. For
instance, they may tell you that of all the ads that you showed, females saw
73 percent of the ads. The interesting part about this is that Nielsen has an
internal panel. That panel does not cover all the people you showed ads to
but just some subset. Part of this panel is then matched against Facebook.
Then they figure out, from the part of the panel that was on Facebook, which
ones self-identified as being female. Whether or not they are female is a separate
question. But this is the basis of the report that tells you that females saw 73
percent of the ads. So the data is some subset of some subset and it is hard to
tell whether ultimately it is a representative sample of my ads. And now the
problem is that I am supposed to optimize this without having any access
to any of the underlying data. But because it is not a predictive modeling
setup, I do not get the answer for any individual instance or person. I only
get aggregate feedback on sets of a hundred thousand impressions.
This is a problem we had been working on and thinking about recently. Internally,
we had brainstormed about it and had basically developed a methodology.
So when we interviewed Melinda, we asked her questions like: “How can you
optimize it?” and “How can you build a model to optimize for females, if this
is what you want?” This is not something we typically want, but we wanted to
hear her thought process. We said, “Tell us what to do about it. You have an
hour. Ask questions if you want to. This is a problem we are working on right
now.” It was quite interesting to have this conversation.
Gutierrez: How did Melinda approach the problem?
Perlich: Melinda went into probability theory, saying, “You have one group
that is 80 percent female. This other group is 70 percent female. The
intersection: Should it be higher than 70 percent or should it be lower? Is the
fact that you show up in both of them increasing my belief that you are female,
or decreasing it?”
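One way to make Melinda's question concrete is sketched below. It is an illustration under stated assumptions, not the method discussed in the interview: it assumes that membership in the two groups is conditionally independent given gender, and it introduces a hypothetical overall base rate of females, which the source does not give. Under those assumptions Bayes' rule reduces to multiplying the two groups' odds and dividing by the base-rate odds. The function names are made up for this example.

```python
def odds(p):
    """Convert a probability into odds in favor."""
    return p / (1.0 - p)


def combine_segments(p_a, p_b, base_rate):
    """Estimate P(female | in group A and in group B).

    ASSUMPTION: membership in A and in B is conditionally independent
    given gender. base_rate is a hypothetical overall share of females;
    it does not come from the interview.
    """
    posterior_odds = odds(p_a) * odds(p_b) / odds(base_rate)
    return posterior_odds / (1.0 + posterior_odds)


# The 80 and 70 percent figures are from the interview; the base rates
# are illustrative values only.
for base_rate in (0.5, 0.75, 0.9):
    estimate = combine_segments(0.8, 0.7, base_rate)
    print(f"base rate {base_rate:.2f} -> intersection {estimate:.3f}")
```

With these numbers the intersection comes out at roughly 0.90 for a base rate of 0.5, roughly 0.76 for 0.75, and roughly 0.51 for 0.9. So whether showing up in both groups raises or lowers the belief depends on how each group compares with the base rate, and the whole calculation stands or falls with the independence assumption.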
So we discussed how to approach this problem with the Bayesian theory of
probabilities, in particular where it was possible to assume independence versus
overlap, and so on. Ultimately this idea was not what we implemented
since the overlap was not sufficient. But we did take some of the ideas
forward and made them into a predictive modeling task: “Well, let's use it to
randomly label examples. Let's get a whole bunch of those that Nielsen thinks are
female. If they say it's 80 percent, then we will label these things as female with
80 percent probability.” We did this for all kinds of segments and then actually
built a model on it. So we faked the outcome, and built the model based on
probabilities, which is, in fact, what we ended up building.
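The labeling idea described above can be sketched roughly as follows. This is a toy illustration, not the model that was actually built: the segment names, the 30 percent rate, the single segment-indicator feature, and the choice of a hand-rolled logistic regression are all assumptions made for the example. Only the 80 percent figure echoes the interview.

```python
import math
import random

# Hypothetical segments with the female share reported for each.
SEGMENT_RATES = {"segment_a": 0.8, "segment_b": 0.3}


def label_by_segment_rate(examples, segment_rates, rng):
    """Give each example a random 0/1 "female" label, drawn with the
    probability reported for the segment the example belongs to."""
    labeled = []
    for features, segment in examples:
        label = 1 if rng.random() < segment_rates[segment] else 0
        labeled.append((features, label))
    return labeled


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Predicted probability of the "female" label for feature vector x."""
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


def fit_logistic(labeled, n_features, epochs=300, lr=1.0):
    """Logistic regression fitted by plain batch gradient descent."""
    w = [0.0] * n_features
    b = 0.0
    n = len(labeled)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in labeled:
            err = predict(w, b, x) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Toy data: one feature that simply marks membership in segment_a.
examples = [([1.0], "segment_a") for _ in range(1000)]
examples += [([0.0], "segment_b") for _ in range(1000)]

labeled = label_by_segment_rate(examples, SEGMENT_RATES, rng)
w, b = fit_logistic(labeled, n_features=1)
print("segment_a-like:", round(predict(w, b, [1.0]), 2))
print("segment_b-like:", round(predict(w, b, [0.0]), 2))
```

Because the only feature here is the segment indicator, the fitted model can do no more than recover each segment's labeled share. The approach becomes useful when the features describe individual impressions and are shared across many segments, so that the model can score impressions for which no aggregate report exists.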
Gutierrez: The interview is basically a working session then.
 