A more informative but harder-to-work-with body measurement data set is the
review a person leaves after wearing the dress. Often, the customers write a
long exposé about the dress and how it fit them. In these reviews, they offer
advice to other people about the dress and why it might or might not fit them
based on the size of the dress they wore and their body. This type of data is
harder to use because we have to parse all that information out with natural
language processing to try to expose the relevant details.
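As a rough illustration of what pulling details out of a review might look like, the sketch below uses plain regular expressions as a simple stand-in for natural language processing. The patterns, the `extract_details` helper, and the sample review are all hypothetical and are not taken from the interview; real reviews would need far more robust handling.

```python
import re

# Hypothetical patterns: a dress size such as "size 8" and a height such as 5'6.
SIZE_RE = re.compile(r"\bsize\s+(\d{1,2})\b", re.IGNORECASE)
HEIGHT_RE = re.compile(r"\b(\d)'\s?(\d{1,2})\b")


def extract_details(review):
    """Return the sizes mentioned and the first height found (in inches)."""
    sizes = [int(s) for s in SIZE_RE.findall(review)]
    match = HEIGHT_RE.search(review)
    height_in = None
    if match:
        height_in = int(match.group(1)) * 12 + int(match.group(2))
    return {"sizes": sizes, "height_in": height_in}


# Made-up review text, for illustration only.
review = "I'm 5'6 and usually a size 6, but I wore the size 8 and it fit well."
details = extract_details(review)
print(details)
```

Regular expressions only catch phrasings someone anticipated, which is one reason free-text reviews are harder to use than structured fields.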
Gutierrez: How can you ensure the accuracy of this very personal data?
Smith: A really simple thing we're doing is looking at the size someone says
she is and the size of dress she actually wore. Additionally, we have a question
that asks, “Is the dress true-to-fit?” Though the question is ambiguous, the
answers give a first-order approximation of whether the wearer found the
dress size large, small, or somewhere in between.
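The comparison of stated size against size actually worn could be sketched roughly as follows. This is only an illustration under stated assumptions: the field names `usual` and `worn`, the `tolerance` threshold, and the sample records are hypothetical, not the team's actual method or data.

```python
from statistics import mean


def size_offset(reviews):
    """Average of (size worn - usual size) across reviews of one dress.

    A positive value suggests wearers sized up, i.e. the dress runs small.
    """
    return mean(r["worn"] - r["usual"] for r in reviews)


def fit_label(offset, tolerance=0.5):
    """Turn an average offset into a coarse label (tolerance is an assumption)."""
    if offset > tolerance:
        return "runs small"
    if offset < -tolerance:
        return "runs large"
    return "true to size"


# Made-up records, for illustration only.
reviews = [
    {"usual": 4, "worn": 6},
    {"usual": 8, "worn": 8},
    {"usual": 2, "worn": 4},
]
label = fit_label(size_offset(reviews))
print(label)
```

Like the true-to-fit question, this gives only a first-order signal, since it ignores who the wearers are and how many reviews a dress has.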
Another way we help people with body measurement-related data is an older
project we have called Our Runway, where we surface pictures of people
wearing the dresses, and then rank them by how similar their body is to
your body. Right now, that sorting is done by putting people into buckets, and I
think we can do a lot better. For instance, we could actually use some type of
distance metric between what you say your body is and what it actually is.
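One possible reading of the distance-metric idea is to rank other wearers by how close their measurements are to yours, rather than grouping them into buckets. The sketch below assumes each person is described by a hypothetical tuple of (bust, waist, hips, height) in inches and uses Euclidean distance; the choice of features, the metric, and the sample values are all assumptions made for illustration.

```python
import math


def body_distance(a, b):
    """Euclidean distance between two equal-length measurement tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_similarity(me, others):
    """Return reviewer ids ordered from most to least similar to `me`."""
    return sorted(others, key=lambda rid: body_distance(me, others[rid]))


# Made-up measurements (bust, waist, hips, height) in inches.
reviewers = {
    "a": (34, 27, 37, 65),
    "b": (38, 31, 41, 68),
    "c": (35, 28, 38, 66),
}
ranking = rank_by_similarity((34, 27, 38, 65), reviewers)
print(ranking)
```

In practice the measurements would likely need scaling or weighting so that no single one, such as height, dominates the distance.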
In terms of fit, we have someone try on the dresses when we first get them,
but it's someone who's a size 0, and so we can't really tell across all sizes
whether it runs large or runs small. She's only one body type, so it's hard to
scale out to all the other sizes and styles of dresses.
Gutierrez: I imagine that the dresses and designers themselves show some
variations that are hard to work with as well?
Smith: Yes, that's a huge issue as well. We've looked into and continue to
look at the fashion designer's actual size chart, and that's just really messy to
deal with, as oftentimes what they say their measurements are doesn't actually
align with their own dresses. So we've looked into measuring the dresses ourselves
and trying to see if that sizing gives us better information on the fit.
Then there's the issue of fabrics and how stretchy they are. If fabrics are
stretchy, then they are more forgiving and can fit many different people. If the
fabric isn't stretchy, then it's less forgiving and fits fewer people. The different
fabrics make sizing an even more complicated problem. So it's a huge dynamical
problem that can be frustrating.
Gutierrez: What tools do you use to store data?
Smith: Here we don't have Hadoop. Here we use databases, like HP's Vertica.
We store in them the data we just talked about and also the pixel logs. The
pixel logs tell us what's going on on the website—like what people are clicking
on, their navigation paths, and other website-related things.
 