Recommendation Systems - Mining of Massive Datasets


Avoiding Overfitting

One problem that often arises when performing a UV-decomposition is that we arrive at one of the many local minima that conforms well to the given data but picks up values in the data that do not reflect the underlying process that gives rise to the data. That is, although the RMSE may be small on the given data, the decomposition does poorly at predicting future data. Statisticians call this problem overfitting. Several things can be done to cope with it:

1. Avoid favoring the first components to be optimized by moving the value of a component only a fraction of the way, say halfway, from its current value toward its optimized value.

2. Stop revisiting elements of U and V well before the process has converged.

3. Take several different UV-decompositions, and when predicting a new entry in the matrix M, take the average of the results of using each decomposition.
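The three remedies above can be sketched together in a few lines of Python. This is only an illustration under stated assumptions, not the book's algorithm: the function names (`predict`, `rmse`, `optimum`, `decompose`, `average_prediction`), the random initialization range, the default `sweeps` and `step` values, and the small ratings matrix are all hypothetical. The single-element optimum is taken to be the ordinary least-squares value over the non-blank entries that the element affects.

```python
import random

def predict(U, V, i, j):
    """Entry (i, j) of the product UV."""
    return sum(U[i][k] * V[k][j] for k in range(len(V)))

def rmse(M, U, V):
    """Root-mean-square error over the non-blank (not None) entries of M."""
    sq = [(M[i][j] - predict(U, V, i, j)) ** 2
          for i in range(len(M)) for j in range(len(M[0]))
          if M[i][j] is not None]
    return (sum(sq) / len(sq)) ** 0.5

def optimum(M, U, V, kind, r, s):
    """Least-squares value for U[r][s] (kind 'u') or V[r][s] (kind 'v')."""
    num = den = 0.0
    if kind == "u":                      # U[r][s] only affects row r of UV
        for j in range(len(M[0])):
            if M[r][j] is not None:
                rest = predict(U, V, r, j) - U[r][s] * V[s][j]
                num += V[s][j] * (M[r][j] - rest)
                den += V[s][j] ** 2
        return num / den if den else U[r][s]
    for i in range(len(M)):              # V[r][s] only affects column s of UV
        if M[i][s] is not None:
            rest = predict(U, V, i, s) - U[i][r] * V[r][s]
            num += U[i][r] * (M[i][s] - rest)
            den += U[i][r] ** 2
    return num / den if den else V[r][s]

def decompose(M, d=2, sweeps=5, step=0.5, seed=0):
    """One UV-decomposition with damped updates and a fixed sweep budget."""
    rng = random.Random(seed)
    n, m = len(M), len(M[0])
    # Assumed starting point: small random positive values.
    U = [[rng.uniform(0.5, 1.5) for _ in range(d)] for _ in range(n)]
    V = [[rng.uniform(0.5, 1.5) for _ in range(m)] for _ in range(d)]
    cells = ([("u", r, s) for r in range(n) for s in range(d)] +
             [("v", r, s) for r in range(d) for s in range(m)])
    for _ in range(sweeps):              # remedy 2: stop after a few sweeps
        rng.shuffle(cells)               # visit the elements in random order
        for kind, r, s in cells:
            target = U if kind == "u" else V
            best = optimum(M, U, V, kind, r, s)
            # Remedy 1: move only a fraction `step` of the way to the optimum.
            target[r][s] += step * (best - target[r][s])
    return U, V

def average_prediction(models, i, j):
    """Remedy 3: average the predictions of several decompositions."""
    return sum(predict(U, V, i, j) for U, V in models) / len(models)

# Made-up 3x3 ratings matrix; None marks a blank entry.
M = [[5, 2, 4], [3, None, 2], [2, 5, None]]
models = [decompose(M, seed=s) for s in range(3)]
print(average_prediction(models, 1, 1))
```

Different seeds give different starting points and visiting orders, so the decompositions in `models` generally settle near different local minima, which is what makes averaging them worthwhile.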

9.4.6 Exercises for Section 9.4

Exercise 9.4.1: Starting with the decomposition of Fig. 9.10, we may choose any of the 20 entries in U or V to optimize first. Perform this first optimization step assuming we choose: (a) $u_{32}$ (b) $v_{41}$.

Exercise 9.4.2: If we wish to start out, as in Fig. 9.10, with all U and V entries set to the same value, what value minimizes the RMSE for the matrix M of our running example?

Exercise 9.4.3: Starting with the U and V matrices in Fig. 9.16, do the following in order:

(a) Reconsider the value of $u_{11}$. Find its new best value, given the changes that have been made so far.

(b) Then choose the best value for $u_{52}$.

(c) Then choose the best value for $v_{22}$.

Exercise 9.4.4: Derive the formula for y (the optimum value of element $v_{rs}$) given at the end of Section 9.4.4.

Exercise 9.4.5: Normalize the matrix M of our running example by:

(a) First subtracting from each element the average of its row, and then subtracting from each element the average of its (modified) column.

(b) First subtracting from each element the average of its column, and then subtracting from each element the average of its (modified) row.

Are there any differences in the results of (a) and (b)?
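For readers who want to check hand calculations for this kind of normalization, the two orders can be sketched as below. The helper names and the small matrix are hypothetical (this is not the running example), and the sketch assumes that each average is taken over the non-blank entries only and that blanks stay blank.

```python
def subtract_row_means(M):
    """Subtract from each non-blank entry the mean of its row's non-blank entries."""
    out = []
    for row in M:
        vals = [x for x in row if x is not None]
        mean = sum(vals) / len(vals) if vals else 0.0
        out.append([None if x is None else x - mean for x in row])
    return out

def transpose(M):
    """Swap rows and columns."""
    return [list(col) for col in zip(*M)]

def subtract_col_means(M):
    """Column version, implemented as the row version applied to the transpose."""
    return transpose(subtract_row_means(transpose(M)))

# Made-up 3x3 matrix; None marks a blank entry.
M = [[5, 2, 4], [3, None, 2], [2, 5, None]]

rows_then_cols = subtract_col_means(subtract_row_means(M))  # order (a)
cols_then_rows = subtract_row_means(subtract_col_means(M))  # order (b)

for name, result in (("(a)", rows_then_cols), ("(b)", cols_then_rows)):
    print(name)
    for row in result:
        print(["blank" if x is None else round(x, 3) for x in row])
```

Substituting the matrix of the running example for `M` and comparing the two printed results entry by entry is one way to approach the final question.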
