Recommendation Systems - Mining of Massive Datasets


Figure 9.16: Replace z by 1.178

that $M$ is an $n$-by-$m$ utility matrix with some entries blank, while $U$ and $V$ are matrices of dimensions $n$-by-$d$ and $d$-by-$m$, for some $d$. We shall use $m_{ij}$, $u_{ij}$, and $v_{ij}$ for the entries in row $i$ and column $j$ of $M$, $U$, and $V$, respectively. Also, let $P = UV$, and use $p_{ij}$ for the element in row $i$ and column $j$ of the product matrix $P$.

Suppose we want to vary $u_{rs}$ and find the value of this element that minimizes the RMSE between $M$ and $UV$. Note that $u_{rs}$ affects only the elements in row $r$ of the product $P = UV$. Thus, we need only concern ourselves with the elements

$$p_{rj} = \sum_{k=1}^{d} u_{rk} v_{kj} = \sum_{k \neq s} u_{rk} v_{kj} + x v_{sj}$$

for all values of $j$ such that $m_{rj}$ is nonblank. In the expression above, we have replaced $u_{rs}$, the element we wish to vary, by a variable $x$, and we use the convention

• $\sum_{k \neq s}$ is shorthand for the sum for $k = 1, 2, \ldots, d$, except for $k = s$.
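The claim that $u_{rs}$ touches only row $r$ of $P$ can be checked numerically. The following is a minimal sketch in plain Python; the matrices, the 0-based indices, and the helper names `matmul` and `row_entry` are hypothetical choices made for illustration, not part of the text.

```python
def matmul(U, V):
    """Multiply an n-by-d matrix U by a d-by-m matrix V (lists of lists)."""
    n, d, m = len(U), len(V), len(V[0])
    return [[sum(U[i][k] * V[k][j] for k in range(d)) for j in range(m)]
            for i in range(n)]


def row_entry(U, V, r, s, j, x):
    """p_rj with u_rs replaced by x: the sum over k != s, plus x * v_sj."""
    d = len(V)
    rest = sum(U[r][k] * V[k][j] for k in range(d) if k != s)
    return rest + x * V[s][j]


# Hypothetical 3-by-2 and 2-by-3 factors, chosen only for illustration.
U = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
V = [[1.0, 0.0, 2.0], [0.0, 1.0, 3.0]]
r, s, x = 1, 0, 10.0  # 0-based row r, column s, and the trial value x

P_before = matmul(U, V)
U_changed = [row[:] for row in U]  # copy so that U itself stays intact
U_changed[r][s] = x
P_after = matmul(U_changed, V)

# Rows of P that differ after the change; only row r should appear.
changed_rows = {i for i in range(len(U)) if P_before[i] != P_after[i]}
print(changed_rows)
```

With these inputs, each entry of row `r` of `P_after` should agree with `row_entry(U, V, r, s, j, x)`, which is the split form of $p_{rj}$ written above.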

If $m_{rj}$ is a nonblank entry of the matrix $M$, then the contribution of this element to the sum of the squares of the errors is

$$(m_{rj} - p_{rj})^2 = \Bigl(m_{rj} - \sum_{k \neq s} u_{rk} v_{kj} - x v_{sj}\Bigr)^2$$

We shall use another convention:

• $\sum_j$ is shorthand for the sum over all $j$ such that $m_{rj}$ is nonblank.

Then we can write the sum of the squares of the errors that are affected by the value of $x = u_{rs}$ as

$$\sum_j \Bigl(m_{rj} - \sum_{k \neq s} u_{rk} v_{kj} - x v_{sj}\Bigr)^2$$
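As an illustration, the sum above can be evaluated as a function of $x$. This is a minimal sketch, assuming blank entries of $M$ are stored as `None` and indices are 0-based; the function name `affected_sse` and the small matrices are hypothetical.

```python
def affected_sse(M, U, V, r, s, x):
    """Sum of squared errors over the nonblank entries of row r of M,
    with u_rs replaced by the trial value x."""
    d = len(V)
    total = 0.0
    for j, m_rj in enumerate(M[r]):
        if m_rj is None:  # blank entry: contributes nothing to the sum
            continue
        # Sum over k != s of u_rk * v_kj; this part does not depend on x.
        rest = sum(U[r][k] * V[k][j] for k in range(d) if k != s)
        total += (m_rj - rest - x * V[s][j]) ** 2
    return total


# Hypothetical data: a 2-by-3 utility matrix with blanks, all-ones factors.
M_example = [[5.0, None, 4.0], [3.0, 1.0, None]]
U_example = [[1.0, 1.0], [1.0, 1.0]]
V_example = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]

for trial in (1.0, 3.0, 3.5, 4.0):
    print(trial, affected_sse(M_example, U_example, V_example, 0, 0, trial))
```

For row 0 of this example the sum reduces to $(4 - x)^2 + (3 - x)^2$, a quadratic in $x$, so it has a single minimum.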

Take the derivative of the above with respect to $x$, and set it equal to 0, in order to find the value of $x$ that minimizes the RMSE. That is,

$$\sum_j -2 v_{sj} \Bigl(m_{rj} - \sum_{k \neq s} u_{rk} v_{kj} - x v_{sj}\Bigr) = 0$$
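The condition above is linear in $x$, so rearranging it gives $x = \sum_j v_{sj}\bigl(m_{rj} - \sum_{k \neq s} u_{rk} v_{kj}\bigr) \big/ \sum_j v_{sj}^2$, provided the denominator is nonzero. The following minimal sketch computes that value and checks that the derivative vanishes there. It assumes blanks are stored as `None` and indices are 0-based; the names `optimal_u_rs` and `sse_derivative` and the example matrices are hypothetical.

```python
def optimal_u_rs(M, U, V, r, s):
    """Value of x = u_rs at which the derivative of the affected error is zero."""
    d = len(V)
    numerator = 0.0
    denominator = 0.0
    for j, m_rj in enumerate(M[r]):
        if m_rj is None:  # skip blank entries of M
            continue
        rest = sum(U[r][k] * V[k][j] for k in range(d) if k != s)
        numerator += V[s][j] * (m_rj - rest)
        denominator += V[s][j] ** 2
    if denominator == 0.0:
        raise ValueError("x is not determined: no nonblank m_rj with nonzero v_sj")
    return numerator / denominator


def sse_derivative(M, U, V, r, s, x):
    """Derivative with respect to x of the affected sum of squared errors."""
    d = len(V)
    total = 0.0
    for j, m_rj in enumerate(M[r]):
        if m_rj is None:
            continue
        rest = sum(U[r][k] * V[k][j] for k in range(d) if k != s)
        total += -2.0 * V[s][j] * (m_rj - rest - x * V[s][j])
    return total


# Hypothetical data: a 2-by-3 utility matrix with blanks, all-ones factors.
M_demo = [[5.0, None, 4.0], [3.0, 1.0, None]]
U_demo = [[1.0, 1.0], [1.0, 1.0]]
V_demo = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]

x_best = optimal_u_rs(M_demo, U_demo, V_demo, 0, 0)
print(x_best, sse_derivative(M_demo, U_demo, V_demo, 0, 0, x_best))
```

Raising an error when the denominator is zero is a design choice made here for the sketch; in that case every nonblank column has $v_{sj} = 0$, and the error does not depend on $x$ at all.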
