Avoiding Overfitting
One problem that often arises when performing a UV-decomposition is that we
arrive at one of the many local minima that conforms well to the given data but
picks up values in the data that do not reflect the underlying process that
gives rise to the data. That is, although the RMSE may be small on the given
data, the decomposition does not do well at predicting future data. Statisticians
call this problem overfitting, and there are several things that can be done to
cope with it.
1. Avoid favoring the first components to be optimized by only moving the
value of a component a fraction of the way, say half way, from its current
value toward its optimized value.
2. Stop revisiting elements of U and V well before the process has converged.
3. Take several different UV decompositions, and when predicting a new
entry in the matrix M, take the average of the results of using each
decomposition.
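As a rough illustration of items 1 and 3 above, the following Python sketch shows a damped update and an averaged prediction. The function names, the default fraction of 0.5, and the toy matrices are assumptions made for this example and do not come from the text; the prediction for entry (i, j) is taken to be the dot product of row i of U with column j of V, as in a UV-decomposition.

```python
import numpy as np

def damped_update(current, optimized, fraction=0.5):
    # Item 1: move only `fraction` of the way from the current value
    # toward the optimized value (0.5 means "half way").
    return current + fraction * (optimized - current)

def average_prediction(decompositions, i, j):
    # Item 3: each decomposition is a (U, V) pair. Its prediction for
    # entry (i, j) is row i of U dotted with column j of V; we average
    # these predictions over all the decompositions supplied.
    predictions = [U[i, :] @ V[:, j] for U, V in decompositions]
    return float(np.mean(predictions))

# Toy inputs, made up for illustration (not the book's running example).
half_step = damped_update(current=1.0, optimized=2.0)
toy_decompositions = [
    (np.ones((5, 2)), np.ones((2, 5))),
    (2 * np.ones((5, 2)), np.ones((2, 5))),
]
averaged = average_prediction(toy_decompositions, i=0, j=0)
```

Item 2 needs no separate helper in a sketch like this: it amounts to capping the number of passes over the elements of U and V rather than looping until the RMSE stops improving.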
9.4.6 Exercises for Section 9.4
Exercise 9.4.1 : Starting with the decomposition of Fig. 9.10, we may choose
any of the 20 entries in U or V to optimize first. Perform this first optimization
step assuming we choose: (a) u_{32} (b) v_{41}.
Exercise 9.4.2 : If we wish to start out, as in Fig. 9.10, with all U and V
entries set to the same value, what value minimizes the RMSE for the matrix
M of our running example?
Exercise 9.4.3 : Starting with the U and V matrices in Fig. 9.16, do the
following in order:
(a) Reconsider the value of u_{11}. Find its new best value, given the changes
that have been made so far.
(b) Then choose the best value for u_{52}.
(c) Then choose the best value for v_{22}.
Exercise 9.4.4 : Derive the formula for y (the optimum value of element v_{rs})
given at the end of Section 9.4.4.
Exercise 9.4.5 : Normalize the matrix M of our running example by:
(a) First subtracting from each element the average of its row, and then
subtracting from each element the average of its (modified) column.
(b) First subtracting from each element the average of its column, and then
subtracting from each element the average of its (modified) row.
Are there any differences in the results of (a) and (b)?
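The two normalization orders in Exercise 9.4.5 can be tried out with a short Python sketch such as the one below. It assumes blank entries are stored as NaN and are ignored when averaging. The helper names and the 3-by-3 matrix are hypothetical, and the matrix is not the M of the running example.

```python
import numpy as np

def subtract_row_means(M):
    # nanmean skips NaN (blank) entries, so each row average is taken
    # over the nonblank entries only; blanks stay NaN after subtraction.
    return M - np.nanmean(M, axis=1, keepdims=True)

def subtract_col_means(M):
    # Same idea, averaging down each column.
    return M - np.nanmean(M, axis=0, keepdims=True)

# Hypothetical utility matrix; NaN marks a blank entry.
toy = np.array([[5.0, 3.0, np.nan],
                [4.0, np.nan, 1.0],
                [1.0, 2.0, 3.0]])

rows_then_cols = subtract_col_means(subtract_row_means(toy))  # order (a)
cols_then_rows = subtract_row_means(subtract_col_means(toy))  # order (b)
```

Calling `np.allclose(rows_then_cols, cols_then_rows, equal_nan=True)` is one way to compare the two results once `toy` is replaced with the matrix of interest.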