Database Reference
In-Depth Information
(f) How much of the energy of the original singular values is retained by the one-di-
mensional approximation?
EXERCISE
11.3.2 Use the SVD from
Fig. 11.7
. Suppose Leslie assigns rating 3 to
Alien
and
rating 4 to
Titanic
, giving us a representation of Leslie in “movie space” of [0, 3, 0, 0, 4].
Find the representation of Leslie in concept space. What does that representation predict
about how well Leslie would like the other movies appearing in our example data?
like. How would you use a similar technique to guess the people that would most like a
given movie, if all you had were the ratings of that movie by a few people?
11.4 CUR Decomposition
There is a problem with SVD that does not show up in the running example of
Section
sparse; that is, most entries are 0. For example, a matrix representing many documents (as
rows) and the words they contain (as columns) will be sparse, because most words are not
present in most documents. Similarly, a matrix of customers and products will be sparse
because most people do not buy most products.
We cannot deal with dense matrices that have millions or billions of rows and/or
agonal it will be sparse, but Σ is usually much smaller than
U
and
V
, so its sparseness does
not help.
In this section, we shall consider another approach to decomposition, called CUR-de-
composition. The merit of this approach lies in the fact that if
M
is sparse, then the two
large matrices (called
C
and
R
for “columns” and “rows”) analogous to
U
and
V
in SVD
are also sparse. Only the matrix in the middle (analogous to Σ in SVD) is dense, but this
matrix is small so the density does not hurt too much.
Unlike SVD, which gives an exact decomposition as long as the parameter
r
is taken to
be at least as great as the rank of the matrix
M
, CUR-decomposition is an approximation
no matter how large we make
r
. There is a theory that guarantees convergence to
M
as
r
gets larger, but typically you have to make
r
so large to get, say within 1% that the method
becomes impractical. Nevertheless, a decomposition with a relatively small value of
r
has
a good probability of being a useful and accurate decomposition.