Then the dot product u_i · v_j is the predicted preference of user i for item j, and you want that to be as close as possible to the actual preference x_{i,j}.
So, you want to find the best choices of U and V that minimize the
squared differences between prediction and observation on everything you actually know, and the idea is that if it's really good on stuff
you know, it will also be good on stuff you're guessing. This should
sound familiar to you—it's mean squared error, like we used for linear
regression.
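As a minimal sketch of that objective (the dimensions and data here are made up for illustration), the mean squared error is computed only over the entries of the preference matrix you actually observed:

```python
import numpy as np

# Hypothetical tiny example: 4 users, 5 items, d = 2 latent features.
rng = np.random.default_rng(0)
U = rng.normal(size=(4, 2))   # one row per user, one column per latent feature
V = rng.normal(size=(5, 2))   # one row per item, one column per latent feature

X = rng.normal(size=(4, 5))           # actual preferences
observed = rng.random((4, 5)) < 0.6   # mask: which preferences we actually know

# Predicted preference of user i for item j is the dot product U[i] @ V[j].
pred = U @ V.T

# Mean squared error, restricted to the entries we observed.
mse = np.mean((pred[observed] - X[observed]) ** 2)
```

Minimizing this quantity over the entries of U and V is the fitting problem; here we just evaluate it for random matrices.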
Now you get to choose a parameter, namely the number d, defined as how many latent features you want to use. The matrix U will have a row
for each user and a column for each latent feature, and the matrix V
will have a row for each item and a column for each latent feature.
How do you choose d? It's typically about 100, because it's more than 20 (as we told you, through the course of developing the product, we found that we had a pretty good grasp on someone if we ask them 20 questions) and it's as much as you care to add before it's computationally too much work.
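One standard way to minimize the objective above (a common choice for this kind of problem, not necessarily the only one) is to alternate: fix V, solve a regularized least-squares problem for each row of U over that user's observed items, then swap roles. A sketch, with all names and data invented for illustration:

```python
import numpy as np

def als_user_step(X, W, F, d, reg=0.1):
    """One alternating-least-squares half-step: solve for the user
    matrix given a fixed item-feature matrix F.

    X   : (n_users, n_items) preference matrix (unobserved entries ignored)
    W   : boolean mask of the same shape marking observed entries
    F   : (n_items, d) fixed item-feature matrix
    reg : small ridge penalty keeping the normal equations well-posed
    """
    U = np.zeros((X.shape[0], d))
    for i in range(X.shape[0]):
        obs = W[i]                                  # items user i rated
        A = F[obs].T @ F[obs] + reg * np.eye(d)     # d x d normal equations
        b = F[obs].T @ X[i, obs]
        U[i] = np.linalg.solve(A, b)
    return U

# Tiny made-up demo: 6 users, 8 items, d = 2.
rng = np.random.default_rng(1)
X = rng.normal(size=(6, 8))
W = rng.random((6, 8)) < 0.7
V = rng.normal(size=(8, 2))
U = als_user_step(X, W, V, d=2)
```

The symmetric step solves for V with U fixed (apply the same function to X.T and W.T), and you iterate the two until the squared error stops improving.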
The resulting latent features are the basis of a well-defined subspace of the total n-dimensional space of potential latent variables. There's no reason to think this solution is unique if there are a bunch of missing values in your “answer” matrix. But that doesn't necessarily matter, because you're just looking for a solution.
Theorem: The resulting latent features will be uncorrelated
We already discussed that correlation was an issue with k-NN, and
who wants to have redundant information going into their model? So
a nice aspect of these latent features is that they're uncorrelated. Here's
a sketch of the proof:
Say we've found matrices U and V with a fixed product U · V = X such
that the squared error term is minimized. The next step is to find the
best U and V such that their entries are small—actually we're mini‐
mizing the sum of the squares of the entries of U and V . But we can
modify U with any invertible d × d matrix G as long as we modify V by its inverse: (U · G) · (G⁻¹ · V) = U · G · G⁻¹ · V = U · V = X.
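This non-uniqueness is easy to check numerically. In the snippet below (sizes chosen arbitrarily, with V written as d × n_items so that U · V = X matches the notation above), a random invertible G produces a visibly different factorization with the same product:

```python
import numpy as np

rng = np.random.default_rng(2)
U = rng.normal(size=(4, 3))   # users x latent features
V = rng.normal(size=(3, 5))   # latent features x items
X = U @ V

G = rng.normal(size=(3, 3))   # a random d x d matrix is almost surely invertible
U2 = U @ G                    # modified user matrix
V2 = np.linalg.inv(G) @ V     # compensating item matrix

# U2 @ V2 = U @ G @ G^{-1} @ V = U @ V = X, up to floating-point error.
same = np.allclose(U2 @ V2, X)
```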