commute is contextually the same as your Thursday A.M. commute, and
Sunday lunch is always Sunday lunch.
We also have a very sophisticated ontology/taxonomy that we use internally.
All of our data and all of our categories of this data get mapped to this
ontology. So this framework that was built out is very sophisticated. It
actually makes scaling much easier to do because you are not trying to boil
the whole ocean.
Gutierrez: So this is the background to the project.
Lenaghan: Correct. This was our location targeting. The big project I want
to talk about, which was important to the company, was what we call our
Audience product line. I briefly covered this earlier. The Audience product
line is our device targeting offering, as opposed to our location targeting.
When I came here, we started to think, “So we're targeting location, which is
great. Location histories are going to be even better.” And so this was taking
the ad-request logs and joining them with the geospatial layer that had already
been built.
Gutierrez: What was the first step in this project?
Lenaghan: We started by writing a query language that allowed us to create
profiles and audiences out of the ad-request logs joined with the geospatial
data layer. The first Audience we wanted to build was air travelers, which
meant we wanted to be able to look at all the location histories of devices
that had been observed in an airport. This was actually an enormous project.
It started off in fits, and there were a lot of things that did not scale so well.
We started off trying to build an Air Traveler audience by finding points in
polygons across the United States. As a first step, we started off by using the
polygons of airports. It is a very complicated computational geometry problem
to find points in polygons mathematically [the point-in-polygon problem].
There are fast ways to do it, but the generic, canned implementations you find
are extremely slow. This approach just did not scale: it was really slow, and
it produced terrible results.
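
To make concrete why the brute-force approach was so expensive, the following is a minimal sketch of a naive point-in-polygon scan in Python. The ray-casting test, the polygon representation, and the observation format are illustrative assumptions for this sketch, not PlaceIQ's actual code.

```python
# A sketch of the naive approach: test every observation against every
# airport polygon with a ray-casting point-in-polygon check.

def point_in_polygon(lon, lat, polygon):
    """Ray-casting (even-odd) test: a point is inside if a horizontal ray
    from it crosses the polygon boundary an odd number of times."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that latitude.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside

def devices_in_airports(observations, airport_polygons):
    """observations: iterable of (device_id, lon, lat) from ad-request logs.
    airport_polygons: list of [(lon, lat), ...] rings.
    Every point is tested against every polygon, which is exactly the part
    that does not scale to national ad-request volumes."""
    hits = set()
    for device_id, lon, lat in observations:
        for polygon in airport_polygons:
            if point_in_polygon(lon, lat, polygon):
                hits.add(device_id)
                break
    return hits
```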
Gutierrez: How did you solve it?
Lenaghan: We solved it by tiling our polygons. You still capture and map
data to these tiles. It's just that, especially for larger polygons, like Walmart
stores, airports, and similar giant structures, the error that you have is small
once you tile it. Once you work at the tile level, everything becomes kind of
abstract again. You have all these keys, and you are doing large key-value joins.
I wrote the first framework to do that work.
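
The following is a minimal sketch of that tiling idea, assuming a simple fixed-size latitude/longitude grid; the tile size, key scheme, and function names are illustrative assumptions rather than PlaceIQ's framework. The point is that once polygons and observations are both reduced to tile keys, the geometry drops out and the work becomes a plain key-value join.

```python
# A sketch of tiling polygons onto a fixed lat/lon grid (assumed scheme).

TILE_DEG = 0.001  # roughly 100 m of latitude per tile; an assumed size

def tile_key(lon, lat):
    """Quantize a coordinate to the integer key of its grid tile."""
    return (int(lon // TILE_DEG), int(lat // TILE_DEG))

def polygon_tiles(polygon):
    """Approximate a polygon by the tiles in its bounding box whose centers
    fall inside it (same ray-casting test as in the earlier sketch)."""
    def contains(lon, lat):
        inside, n = False, len(polygon)
        for i in range(n):
            x1, y1 = polygon[i]
            x2, y2 = polygon[(i + 1) % n]
            if (y1 > lat) != (y2 > lat):
                if lon < x1 + (lat - y1) * (x2 - x1) / (y2 - y1):
                    inside = not inside
        return inside

    lons = [p[0] for p in polygon]
    lats = [p[1] for p in polygon]
    tiles = set()
    for tx in range(int(min(lons) // TILE_DEG), int(max(lons) // TILE_DEG) + 1):
        for ty in range(int(min(lats) // TILE_DEG), int(max(lats) // TILE_DEG) + 1):
            if contains((tx + 0.5) * TILE_DEG, (ty + 0.5) * TILE_DEG):
                tiles.add((tx, ty))
    return tiles

def air_traveler_audience(observations, airport_polygons):
    """Once airport polygons are reduced to a set of tile keys, building the
    audience is a key-value join: quantize each observation to its tile key
    and look it up in the airport key set."""
    airport_keys = set()
    for polygon in airport_polygons:
        airport_keys |= polygon_tiles(polygon)
    return {device_id for device_id, lon, lat in observations
            if tile_key(lon, lat) in airport_keys}
```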
Once we had the audience, the next part of the project was figuring out the
demographics of that audience. You are able to make particular anonymized
inferences about the demographics according to where people happen to be.
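
As one illustration of what such an inference could look like, the following is a minimal sketch that averages a hypothetical tile-level demographics table over the tiles where an audience was observed; the attribute names and the simple averaging are assumptions for this sketch, not PlaceIQ's methodology.

```python
# A sketch of audience-level demographic inference from tile-level data.
from collections import defaultdict

def audience_demographics(device_tiles, tile_demographics):
    """device_tiles: {device_id: set of tile keys where it was observed}.
    tile_demographics: {tile_key: {"median_income": ..., "pct_commuters": ...}}
    (a hypothetical table, e.g. census attributes mapped onto the tile grid).
    Returns audience-level averages only, never per-device attributes."""
    totals, observed = defaultdict(float), 0
    for tiles in device_tiles.values():
        for tile in tiles:
            demo = tile_demographics.get(tile)
            if demo is None:
                continue
            for attribute, value in demo.items():
                totals[attribute] += value
            observed += 1
    if observed == 0:
        return {}
    return {attribute: total / observed for attribute, total in totals.items()}
```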
 