Database Reference
In-Depth Information
(f) How much of the energy of the original singular values is retained by the one-di-
mensional approximation?
EXERCISE 11.3.2 Use the SVD from Fig. 11.7 . Suppose Leslie assigns rating 3 to Alien and
rating 4 to Titanic , giving us a representation of Leslie in “movie space” of [0, 3, 0, 0, 4].
Find the representation of Leslie in concept space. What does that representation predict
about how well Leslie would like the other movies appearing in our example data?
! EXERCISE 11.3.3 Demonstrate that the rank of the matrix in Fig. 11.8 is 3.
! EXERCISE 11.3.4 Section 11.3.5 showed how to guess the movies a person would most
like. How would you use a similar technique to guess the people that would most like a
given movie, if all you had were the ratings of that movie by a few people?
11.4 CUR Decomposition
There is a problem with SVD that does not show up in the running example of Section
11.3 . In large-data applications, it is normal for the matrix M being decomposed to be very
sparse; that is, most entries are 0. For example, a matrix representing many documents (as
rows) and the words they contain (as columns) will be sparse, because most words are not
present in most documents. Similarly, a matrix of customers and products will be sparse
because most people do not buy most products.
We cannot deal with dense matrices that have millions or billions of rows and/or
columns. However, with SVD, even if M is sparse, U and V will be dense. 4 Since Σ is di-
agonal it will be sparse, but Σ is usually much smaller than U and V , so its sparseness does
not help.
In this section, we shall consider another approach to decomposition, called CUR-de-
composition. The merit of this approach lies in the fact that if M is sparse, then the two
large matrices (called C and R for “columns” and “rows”) analogous to U and V in SVD
are also sparse. Only the matrix in the middle (analogous to Σ in SVD) is dense, but this
matrix is small so the density does not hurt too much.
Unlike SVD, which gives an exact decomposition as long as the parameter r is taken to
be at least as great as the rank of the matrix M , CUR-decomposition is an approximation
no matter how large we make r . There is a theory that guarantees convergence to M as r
gets larger, but typically you have to make r so large to get, say within 1% that the method
becomes impractical. Nevertheless, a decomposition with a relatively small value of r has
a good probability of being a useful and accurate decomposition.
Search WWH ::




Custom Search