the concept science fiction. The value 0.14 in the first row and first column of U is smaller
than some of the other entries in that column because, while Joe watches only science
fiction, he doesn't rate those movies highly. The second column of the first row of U is 0,
because Joe doesn't rate romance movies at all.
The matrix V relates movies to concepts. The 0.58 in each of the first three columns of
the first row of V^T indicates that the first three movies (The Matrix, Alien, and Star Wars)
are each of the science-fiction genre, while the 0's in the last two columns of the first row
say that these movies do not partake of the concept romance at all. Likewise, the second
row of V^T tells us that the movies Casablanca and Titanic are exclusively romances.
Finally, the matrix Σ gives the strength of each of the concepts. In our example, the
strength of the science-fiction concept is 12.4, while the strength of the romance concept
is 9.5. Intuitively, the science-fiction concept is stronger because the data provides more
information about the movies of that genre and the people who like them.
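The numbers quoted above can be checked numerically. The following is a minimal sketch, not the method used to produce the figures in the text: it finds the two leading singular values and vectors by power iteration on M^T M with deflation. The ratings matrix is not reproduced in this section, so the contents of `M` below are an assumption about Fig. 11.6, and the helper names (`matvec`, `top_singular_triple`, `leading_triples`) are hypothetical.

```python
import math

# Assumed contents of Fig. 11.6 (the figure is not reproduced in this section).
# Rows: Joe, Jim, John, Jack, Jill, Jenny, Jane.
# Columns: The Matrix, Alien, Star Wars, Casablanca, Titanic.
M = [
    [1, 1, 1, 0, 0],
    [3, 3, 3, 0, 0],
    [4, 4, 4, 0, 0],
    [5, 5, 5, 0, 0],
    [0, 0, 0, 4, 4],
    [0, 0, 0, 5, 5],
    [0, 0, 0, 2, 2],
]

def matvec(A, x):
    """Return the matrix-vector product A x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def norm(x):
    """Euclidean length of a vector."""
    return math.sqrt(sum(v * v for v in x))

def top_singular_triple(A, iters=500):
    """Power iteration on A^T A; returns (sigma, u, v) for the largest sigma."""
    At = [list(col) for col in zip(*A)]
    v = [1.0 + 0.1 * j for j in range(len(A[0]))]  # arbitrary positive start
    for _ in range(iters):
        w = matvec(At, matvec(A, v))
        n = norm(w)
        v = [x / n for x in w]
    Av = matvec(A, v)
    sigma = norm(Av)
    return sigma, [x / sigma for x in Av], v

def leading_triples(A, k):
    """First k singular triples, subtracting each found term before the next."""
    R = [[float(x) for x in row] for row in A]
    out = []
    for _ in range(k):
        sigma, u, v = top_singular_triple(R)
        out.append((sigma, u, v))
        R = [[R[i][j] - sigma * u[i] * v[j] for j in range(len(v))]
             for i in range(len(u))]
    return out

(s1, u1, v1), (s2, u2, v2) = leading_triples(M, 2)
print(round(s1, 1), round(s2, 1))        # strengths of the two concepts
print(round(u1[0], 2), round(u2[0], 2))  # Joe's row of U
print([round(x, 2) for x in v1])         # first row of V^T
```

Under the assumed matrix, the rounded results should agree with the 12.4, 9.5, 0.14, and 0.58 discussed above. Only two triples are requested because the matrix has rank 2; asking for a third would divide by a singular value that is numerically zero.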
In general, the concepts will not be so clearly delineated. There will be fewer 0's in U
and V, although Σ is always a diagonal matrix and will always have 0's off the diagonal.
The entities represented by the rows and columns of M (analogous to people and movies
in our example) will partake of several different concepts to varying degrees. In fact, the
decomposition of Example 11.8 was especially simple, since the rank of the matrix M was
equal to the desired number of columns of U, Σ, and V. We were therefore able to get an
exact decomposition of M with only two columns for each of the three matrices U, Σ, and V;
the product U Σ V^T, if carried out to infinite precision, would be exactly M. In practice, life
is not so simple. When the rank of M is greater than the number of columns we want for the
matrices U, Σ, and V, the decomposition is not exact. We need to eliminate from the exact
decomposition those columns of U and V that correspond to the smallest singular values,
in order to get the best approximation. The following example is a slight modification of
Example 11.8 that will illustrate the point.
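Before turning to that example, the effect of dropping a singular value can be sketched on the matrix of Example 11.8 itself. The block below assumes that matrix is the sum of two rank-1 blocks (science-fiction raters times science-fiction movies, and romance raters times romance movies), which is an assumption about Fig. 11.6 rather than something reproduced here. For an outer product a b^T, the only nonzero singular value is |a||b|, with unit vectors a/|a| and b/|b|; since the two blocks do not overlap, these two triples together form the decomposition. The function names are hypothetical.

```python
import math

# Assumed rank-1 blocks behind the ratings matrix of Example 11.8:
# each pair is (ratings per user, indicator of the movies in the genre).
scifi = ([1, 3, 4, 5, 0, 0, 0], [1, 1, 1, 0, 0])
romance = ([0, 0, 0, 0, 4, 5, 2], [0, 0, 0, 1, 1])

def outer(a, b, scale=1.0):
    """Matrix scale * a * b^T."""
    return [[scale * x * y for y in b] for x in a]

def add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def unit(x):
    """Return (x / |x|, |x|)."""
    n = math.sqrt(sum(v * v for v in x))
    return [v / n for v in x], n

def singular_triple(block):
    """For an outer product a b^T: sigma = |a||b|, u = a/|a|, v = b/|b|."""
    (u, na), (v, nb) = unit(block[0]), unit(block[1])
    return na * nb, u, v

def truncate(triples, k):
    """Sum of the k terms sigma * u * v^T with the largest sigma."""
    kept = sorted(triples, key=lambda t: t[0], reverse=True)[:k]
    m, n = len(triples[0][1]), len(triples[0][2])
    approx = [[0.0] * n for _ in range(m)]
    for sigma, u, v in kept:
        approx = add(approx, outer(u, v, sigma))
    return approx

def frobenius_error(A, B):
    """Square root of the sum of squared entrywise differences."""
    return math.sqrt(sum((x - y) ** 2
                         for ra, rb in zip(A, B) for x, y in zip(ra, rb)))

M = add(outer(*scifi), outer(*romance))
triples = [singular_triple(scifi), singular_triple(romance)]
err_rank2 = frobenius_error(M, truncate(triples, 2))  # both concepts kept
err_rank1 = frobenius_error(M, truncate(triples, 1))  # romance concept dropped
print(err_rank2, err_rank1)
```

Keeping both terms reproduces M up to floating-point rounding. Keeping only the stronger term leaves a Frobenius-norm error equal to the dropped singular value, which is why the columns with the smallest singular values are the ones to eliminate.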
EXAMPLE 11.9 Figure 11.8 is almost the same as Fig. 11.6, but Jill and Jane rated Alien,
although neither liked it very much. The rank of the matrix in Fig. 11.8 is 3; for example,
the first, sixth, and seventh rows are independent, but you can check that no four rows are
independent. Figure 11.9 shows the decomposition of the matrix from Fig. 11.8.
Figure 11.8 The new matrix M′, with ratings for Alien by two additional raters
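The rank claim can be checked mechanically. The sketch below computes rank by Gaussian elimination over exact fractions, so no rounding is involved. The figure itself is not reproduced in this section, so the entries of `M_PRIME`, in particular the two new Alien ratings (2 for Jill and 1 for Jane), are an assumption about Fig. 11.8; `rank` is a hypothetical helper name.

```python
from fractions import Fraction

# Assumed contents of Fig. 11.8 (not reproduced in this section).
# Columns: The Matrix, Alien, Star Wars, Casablanca, Titanic.
M_PRIME = [
    [1, 1, 1, 0, 0],  # Joe
    [3, 3, 3, 0, 0],  # Jim
    [4, 4, 4, 0, 0],  # John
    [5, 5, 5, 0, 0],  # Jack
    [0, 2, 0, 4, 4],  # Jill (assumed Alien rating: 2)
    [0, 0, 0, 5, 5],  # Jenny
    [0, 1, 0, 2, 2],  # Jane (assumed Alien rating: 1)
]

def rank(rows):
    """Rank by Gaussian elimination over exact fractions."""
    A = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivot rows found so far
    for c in range(len(A[0])):
        # Find a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        # Clear column c in every row below the pivot row.
        for i in range(r + 1, len(A)):
            f = A[i][c] / A[r][c]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
        r += 1
        if r == len(A):
            break
    return r

print(rank(M_PRIME))                                    # rank of the whole matrix
print(rank([M_PRIME[0], M_PRIME[5], M_PRIME[6]]))       # rows 1, 6, and 7 only
```

If the full matrix has rank 3, then no four of its rows can be independent, so the two calls together cover both halves of the statement in the example.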