9.3.2 The Duality of Similarity
The utility matrix can be viewed as telling us about users or about items, or both. It is im-
portant to realize that any of the techniques we suggested in Section 9.3.1 for finding sim-
ilar users can be used on columns of the utility matrix to find similar items. There are two
ways in which the symmetry is broken in practice.
(1) We can use information about users to recommend items. That is, given a user, we can
find some number of the most similar users, perhaps using the techniques of Chapter
3. We can base our recommendation on the decisions made by these similar users, e.g.,
recommend the items that the greatest number of them have purchased or rated highly.
However, items offer no symmetric shortcut. Even if we find pairs of similar items, we need to take
an additional step in order to recommend items to users. This point is explored further
at the end of this subsection.
(2) There is a difference in the typical behavior of users and items, as it pertains to similar-
ity. Intuitively, items tend to be classifiable in simple terms. For example, music tends
to belong to a single genre. It is impossible, e.g., for a piece of music to be both 60's
rock and 1700's baroque. On the other hand, there are individuals who like both 60's
rock and 1700's baroque, and who buy examples of both types of music. The conse-
quence is that it is easier to discover items that are similar because they belong to the
same genre, than it is to detect that two users are similar because they prefer one genre
in common, while each also likes some genres that the other doesn't care for.
As we suggested in (1) above, one way of predicting the value of the utility-matrix entry
for user U and item I is to find the n users (for some predetermined n) most similar to U and
average their ratings for item I, counting only those among the n similar users who have
rated I. It is generally better to normalize the matrix first. That is, for each of the n users,
subtract their average rating over all items from their rating for I. Average these differences
over those users who have rated I, and then add this average to the average rating that U gives
for all items. This normalization adjusts the estimate in the case that U tends to give very
high or very low ratings, or a large fraction of the similar users who rated I (of which there
may be only a few) are users who tend to rate very high or very low.
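The normalized user-based estimate described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the utility matrix is held as a hypothetical dict of dicts (user -> item -> rating), similarity is taken to be cosine on mean-centered ratings (one of several measures that could be used), and falling back to U's own average when no neighbor has rated I is a choice made here, not something the text prescribes. The names predict_user_based, cosine, and mean are invented for the example.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty collection of numbers."""
    values = list(values)
    return sum(values) / len(values)


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an all-zero vector is similar to nothing
    return dot / (norm_a * norm_b)


def predict_user_based(ratings, user, item, n):
    """Estimate the entry for (user, item) from the n most similar users.

    ratings: dict mapping user -> {item: rating}; every user is assumed
    to have rated at least one item.
    """
    user_mean = mean(ratings[user].values())

    # Normalize: subtract each user's average rating from their ratings.
    centered = {
        u: {i: r - mean(row.values()) for i, r in row.items()}
        for u, row in ratings.items()
    }

    # The n users most similar to `user` (similarity measure is an assumption).
    others = [u for u in centered if u != user]
    neighbors = sorted(
        others,
        key=lambda u: cosine(centered[user], centered[u]),
        reverse=True,
    )[:n]

    # Count only those neighbors who have actually rated the item.
    deviations = [centered[u][item] for u in neighbors if item in centered[u]]
    if not deviations:
        return user_mean  # fallback chosen for this sketch

    # Add the average deviation back to the user's own average rating.
    return user_mean + mean(deviations)


# Toy utility matrix; the values are made up for illustration.
ratings = {
    "U": {"a": 4, "b": 2},
    "V": {"a": 5, "b": 2, "c": 5},
    "W": {"a": 1, "b": 5, "c": 3},
}
estimate = predict_user_based(ratings, "U", "c", n=1)
```

In the toy data, V agrees with U on items a and b while W disagrees, so with n = 1 only V's above-average rating of c contributes to the estimate.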
Dually, we can use item similarity to estimate the entry for user U and item I. Find the m
items most similar to I, for some m, and take the average, among the m items, of the
ratings that U has given. As for user-user similarity, we consider only those items among
the m that U has rated, and it is probably wise to normalize item ratings first.
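A corresponding sketch for the item-based estimate is given below. The text does not spell out how item ratings should be normalized; this example assumes the direct dual of the user-based case: subtract each item's average rating from the ratings of that item, average U's deviations over the similar items U has rated, and add the result to the average rating of I. The dict-of-dicts layout, the cosine measure, the fallback to I's average, and the names predict_item_based and item_columns are all assumptions made for illustration.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty collection of numbers."""
    values = list(values)
    return sum(values) / len(values)


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an all-zero vector is similar to nothing
    return dot / (norm_a * norm_b)


def item_columns(ratings):
    """Turn user -> {item: rating} into item -> {user: rating}."""
    columns = {}
    for user, row in ratings.items():
        for item, rating in row.items():
            columns.setdefault(item, {})[user] = rating
    return columns


def predict_item_based(ratings, user, item, m):
    """Estimate the entry for (user, item) from the m most similar items.

    Assumes `item` has been rated by at least one user.
    """
    columns = item_columns(ratings)
    item_mean = {i: mean(col.values()) for i, col in columns.items()}

    # Normalize: subtract each item's average rating from its column.
    centered = {
        i: {u: r - item_mean[i] for u, r in col.items()}
        for i, col in columns.items()
    }

    # The m items most similar to `item` (similarity measure is an assumption).
    others = [i for i in centered if i != item]
    neighbors = sorted(
        others,
        key=lambda i: cosine(centered[item], centered[i]),
        reverse=True,
    )[:m]

    # Count only those similar items that the user has actually rated.
    deviations = [centered[i][user] for i in neighbors if user in centered[i]]
    if not deviations:
        return item_mean[item]  # fallback chosen for this sketch

    return item_mean[item] + mean(deviations)


# Toy utility matrix; the values are made up for illustration.
ratings = {
    "U": {"a": 5, "b": 2},
    "V": {"a": 4, "b": 1, "c": 5},
    "W": {"a": 3, "b": 3, "c": 1},
}
estimate = predict_item_based(ratings, "U", "c", m=1)
```

Here item a is rated in the same direction as c by V and W, while b is rated in the opposite direction, so with m = 1 the estimate rests on U's rating of a alone.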
Note that whichever approach to estimating entries in the utility matrix we use, it is not
sufficient to find only one entry. In order to recommend items to a user U, we need to
estimate every entry in the row of the utility matrix for U, or at least find all or most of the