However, in-home and out-of-home locations are very different and will give
different demographic results. So in order to get this right, we had to build a
classifier for “what does it mean for a tile to be residential.”
Gutierrez: Sounds deceptively simple. There must have been more than a
few stumbling blocks. What were they?
Lenaghan: It does sound really easy. You look at the map and search for
a house. Once you see a house, you know the tile is residential, so you are
able to get demographic results. However, doing this across the one billion
tiles in the United States means that you have to do that programmatically
somehow. The power of the classifier comes from being able to designate a
tile as residential or nonresidential. So this was an important step to figure
out. Unfortunately, there is not a good data set that says, “This particular tile
is residential.”
Gutierrez: How did you develop the data set to tell you if a tile was
residential?
Lenaghan: We used a lot of different data sets, including a lot of ad-request
data, and tried a lot of different features to figure out where the residences
were. Again, sounds straightforward, but it was not straightforward at all. As
an example of why we had to use multiple data sets, the census data does not
work because it is defined in terms of census blocks, which are
enormous. So if you were to just use census data as your residential signal, you
would have a residential signal essentially everywhere in the United States.
Gutierrez: Tell me about the classifier you developed.
Lenaghan: The classifier we came up with had about sixteen features that
indicated whether or not the tile was residential. We then had to finish building
out this very high-quality residential classifier. Once we had that, we could
figure out from all these location histories what demographic attributes to
give the Air Traveler audience.
Now we have these in-home and out-of-home components of the audience,
which give us a base data layer for building any sort of movement profile that
we would want. So we can now combine “a device that tends to be in households
with this particular demographic” with “a device that tends to dwell in coffee
shops and has been observed on an auto lot for a particular brand.”
Gutierrez: Is this where the query language comes in?
Lenaghan: Yes. Now that we have the data and the classifier, we then have
to build up the query language to help us create the types of audiences we
wanted. This means the query language has to be able to write these rules and
has to be able to hook into the geospatial base data layer to pull out these
audiences.
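
The idea of rules that hook into a base data layer can be illustrated with a minimal sketch. This is not the query language described in the interview, whose syntax is not given here. Every name below (DeviceProfile, in_home, dwells_in, observed_at, all_of, build_audience) and all of the sample data are hypothetical, chosen only to show how an in-home rule and out-of-home rules might be combined into one audience definition.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, List


@dataclass(frozen=True)
class DeviceProfile:
    """Hypothetical row of the base data layer for one device."""
    device_id: str
    home_demographic: str            # derived from tiles classified as residential
    dwell_places: FrozenSet[str]     # out-of-home place types where the device dwells
    observed_places: FrozenSet[str]  # out-of-home places where the device was seen


# A rule is any predicate over a device profile.
Rule = Callable[[DeviceProfile], bool]


def in_home(demographic: str) -> Rule:
    """In-home rule: the device's household matches a demographic label."""
    return lambda d: d.home_demographic == demographic


def dwells_in(place_type: str) -> Rule:
    """Out-of-home rule: the device tends to dwell in a type of place."""
    return lambda d: place_type in d.dwell_places


def observed_at(place: str) -> Rule:
    """Out-of-home rule: the device was observed at a specific place."""
    return lambda d: place in d.observed_places


def all_of(*rules: Rule) -> Rule:
    """Combine rules so that every one of them must hold."""
    return lambda d: all(rule(d) for rule in rules)


def build_audience(devices: Iterable[DeviceProfile], rule: Rule) -> List[str]:
    """Return the ids of the devices that satisfy the rule."""
    return [d.device_id for d in devices if rule(d)]


# Made-up sample data, for illustration only.
devices = [
    DeviceProfile("dev-1", "demo-A",
                  frozenset({"coffee_shop"}), frozenset({"auto_lot:brand-x"})),
    DeviceProfile("dev-2", "demo-A",
                  frozenset({"gym"}), frozenset({"auto_lot:brand-x"})),
    DeviceProfile("dev-3", "demo-B",
                  frozenset({"coffee_shop"}), frozenset({"auto_lot:brand-x"})),
]

rule = all_of(
    in_home("demo-A"),
    dwells_in("coffee_shop"),
    observed_at("auto_lot:brand-x"),
)
audience = build_audience(devices, rule)
```

In this sketch each rule is a plain predicate, so an audience is a composition of predicates evaluated against the data layer. A production system over a billion tiles would presumably push such rules down into a distributed query rather than filter records in memory.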
 