As in the previous examples, the common factor −2 can be dropped. We solve
the above equation for x, and get
\[
x = \frac{\sum_j v_{sj} \left( m_{rj} - \sum_{k \neq s} u_{rk} v_{kj} \right)}{\sum_j v_{sj}^2}
\]
There is an analogous formula for the optimum value of an element of V. If
we want to vary $v_{rs} = y$, then the value of y that minimizes the RMSE is
\[
y = \frac{\sum_i u_{ir} \left( m_{is} - \sum_{k \neq r} u_{ik} v_{ks} \right)}{\sum_i u_{ir}^2}
\]
Here, $\sum_i$ is shorthand for the sum over all i such that $m_{is}$ is nonblank, and
$\sum_{k \neq r}$ is the sum over all values of k between 1 and d, except for k = r.
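The two formulas can be checked numerically. The following is a minimal sketch, not a definitive implementation: the function names `optimal_u` and `optimal_v` are hypothetical, blanks in M are assumed to be represented by `None`, indices are 0-based, and U is stored as n rows of d values while V is stored as d rows of m values. The small matrix in the demo is illustrative only.

```python
def optimal_u(M, U, V, r, s):
    """Value x for U[r][s] that minimizes the RMSE with all other entries fixed."""
    d = len(V)  # number of latent dimensions (rows of V)
    num = 0.0
    den = 0.0
    for j, m_rj in enumerate(M[r]):
        if m_rj is None:  # sum only over nonblank entries of row r
            continue
        # Inner sum over all k except k = s
        rest = sum(U[r][k] * V[k][j] for k in range(d) if k != s)
        num += V[s][j] * (m_rj - rest)
        den += V[s][j] ** 2
    # If the denominator is zero, keep the current value (a choice made here)
    return num / den if den else U[r][s]


def optimal_v(M, U, V, r, s):
    """Value y for V[r][s] that minimizes the RMSE with all other entries fixed."""
    d = len(V)
    num = 0.0
    den = 0.0
    for i in range(len(M)):
        m_is = M[i][s]
        if m_is is None:  # sum only over nonblank entries of column s
            continue
        # Inner sum over all k except k = r
        rest = sum(U[i][k] * V[k][s] for k in range(d) if k != r)
        num += U[i][r] * (m_is - rest)
        den += U[i][r] ** 2
    return num / den if den else V[r][s]


# Illustrative 5-by-5 utility matrix with two blanks, d = 2, all-ones start
M = [[5, 2, 4, 4, 3],
     [3, 1, 2, 4, 1],
     [2, None, 3, 1, 4],
     [2, 5, 4, 3, 5],
     [4, 4, 5, 4, None]]
U = [[1.0, 1.0] for _ in range(5)]
V = [[1.0] * 5 for _ in range(2)]

U[0][0] = optimal_u(M, U, V, 0, 0)
V[0][0] = optimal_v(M, U, V, 0, 0)
print(round(U[0][0], 3), round(V[0][0], 3))  # prints 2.6 1.617
```

Each call changes a single element, so the second call already sees the updated value of `U[0][0]`.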
9.4.5 Building a Complete UV-Decomposition Algorithm
Now, we have the tools to search for the global optimum decomposition of a
utility matrix M. There are four areas where we shall discuss the options.
1. Preprocessing of the matrix M.
2. Initializing U and V.
3. Ordering the optimization of the elements of U and V.
4. Ending the attempt at optimization.
Preprocessing
Because the differences in the quality of items and the rating scales of users are
such important factors in determining the missing elements of the matrix M, it
is often useful to remove these influences before doing anything else. The idea
was introduced in Section 9.3.1. We can subtract from each nonblank element
$m_{ij}$ the average rating of user i. Then, the resulting matrix can be modified
by subtracting the average rating (in the modified matrix) of item j. It is also
possible to first subtract the average rating of item j and then subtract the
average rating of user i in the modified matrix. The results one obtains from
doing things in these two different orders need not be the same, but will tend
to be close. A third option is to normalize by subtracting from $m_{ij}$ the average
of user i's average rating and item j's average rating, that is, one half the sum
of the user average and the item average.
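The first of these options (user averages first, then item averages of the modified matrix) might be sketched as follows. This is an illustration under stated assumptions: the function name `normalize` is hypothetical, blanks are represented by `None`, and a row or column with no ratings is given an average of 0.

```python
def normalize(M):
    """Subtract user (row) averages, then item (column) averages of the result.

    Returns the normalized matrix together with both lists of averages,
    which are needed later to undo the normalization.
    """
    n_rows, n_cols = len(M), len(M[0])

    # Step 1: average rating of each user over that user's nonblank entries
    user_avg = []
    for row in M:
        vals = [x for x in row if x is not None]
        user_avg.append(sum(vals) / len(vals) if vals else 0.0)
    N = [[None if x is None else x - user_avg[i] for x in row]
         for i, row in enumerate(M)]

    # Step 2: average of each item, computed in the modified matrix
    item_avg = []
    for j in range(n_cols):
        vals = [N[i][j] for i in range(n_rows) if N[i][j] is not None]
        item_avg.append(sum(vals) / len(vals) if vals else 0.0)
    N = [[None if x is None else x - item_avg[j] for j, x in enumerate(row)]
         for row in N]

    return N, user_avg, item_avg


# Illustrative 2-by-3 matrix with two blanks
N, user_avg, item_avg = normalize([[4, 2, None], [5, None, 1]])
print(N)          # [[-0.5, 0.0, None], [0.5, None, 0.0]]
print(user_avg)   # [3.0, 3.0]
print(item_avg)   # [1.5, -1.0, -2.0]
```

Swapping the two steps gives the second ordering described above; the blanks stay blank in either case.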
If we choose to normalize M, then when we make predictions, we need to
undo the normalization. That is, if whatever prediction method we use results
in estimate e for an element $m_{ij}$ of the normalized matrix, then the value
we predict for $m_{ij}$ in the true utility matrix is e plus whatever amount was
subtracted from row i and from column j during the normalization process.
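Undoing the normalization amounts to adding the stored offsets back. A minimal sketch, assuming the amounts subtracted from each row and each column were kept in two lists (the names `denormalize`, `row_offset`, and `col_offset` are hypothetical):

```python
def denormalize(e, i, j, row_offset, col_offset):
    """Map estimate e for normalized entry (i, j) back to the original scale.

    row_offset[i] and col_offset[j] are the amounts that were subtracted
    from row i and column j during normalization.
    """
    return e + row_offset[i] + col_offset[j]


# Offsets as they might come out of a user-then-item normalization
row_offset = [3.0, 3.0]
col_offset = [1.5, -1.0, -2.0]
print(denormalize(-0.5, 0, 0, row_offset, col_offset))  # 4.0
```

For the third normalization option, the offsets would each be one half of the corresponding user or item average, since that is what was subtracted.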