9.1.1 The Utility Matrix
In a recommendation-system application there are two classes of entities, which we shall
refer to as users and items. Users have preferences for certain items, and these preferences
must be teased out of the data. The data itself is represented as a utility matrix, giving, for
each user-item pair, a value that represents what is known about the degree of preference
of that user for that item. Values come from an ordered set, e.g., the integers 1-5 representing
the number of stars that the user gave as a rating for that item. We assume that the matrix is
sparse, meaning that most entries are “unknown.” An unknown rating implies that we have
no explicit information about the user's preference for the item.
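As a minimal sketch of this representation, a sparse utility matrix can be stored as a nested dictionary that holds only the known ratings, so that a missing key stands for "unknown." The user and item names, the rating values, and the helper functions get_rating and density below are hypothetical and are not taken from the text.

```python
from typing import Dict, Optional

# user -> {item -> rating}; only known ratings are stored.
UtilityMatrix = Dict[str, Dict[str, int]]

# Hypothetical ratings on a 1-5 scale, invented for illustration only.
ratings: UtilityMatrix = {
    "u1": {"m1": 4, "m3": 1},
    "u2": {"m1": 5, "m2": 5},
    "u3": {"m3": 4, "m4": 5},
}


def get_rating(matrix: UtilityMatrix, user: str, item: str) -> Optional[int]:
    """Return the known rating, or None when the entry is unknown."""
    return matrix.get(user, {}).get(item)


def density(matrix: UtilityMatrix, n_items: int) -> float:
    """Fraction of user-item pairs that have a known rating."""
    known = sum(len(row) for row in matrix.values())
    return known / (len(matrix) * n_items)


print(get_rating(ratings, "u1", "m1"))  # a known entry
print(get_rating(ratings, "u1", "m2"))  # an unknown entry yields None
print(density(ratings, n_items=4))      # share of filled entries
```

Storing only the known entries keeps memory proportional to the number of ratings rather than to the number of user-item pairs, which matters when the matrix is as sparse as the text assumes.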
EXAMPLE 9.1 In Fig. 9.1 we see an example utility matrix, representing users' ratings of
movies on a 1-5 scale, with 5 the highest rating. Blanks represent the situation where the
user has not rated the movie. The movie names are HP1, HP2, and HP3 for Harry Potter I,
II, and III, TW for Twilight, and SW1, SW2, and SW3 for Star Wars episodes 1, 2, and 3.
The users are represented by capital letters A through D.
Figure 9.1 A utility matrix representing ratings of movies on a 1-5 scale
Notice that most user-movie pairs have blanks, meaning the user has not rated the movie.
In practice, the matrix would be even sparser, with the typical user rating only a tiny
fraction of all available movies.
The goal of a recommendation system is to predict the blanks in the utility matrix. For
example, would user A like SW2? There is little evidence from the tiny matrix in Fig. 9.1.
We might design our recommendation system to take into account properties of movies,
such as their producer, director, stars, or even the similarity of their names. If so, we might
then note the similarity between SW1 and SW2, and then conclude that since A did not
like SW1, A is unlikely to enjoy SW2 either. Alternatively, with much more data, we
might observe that the people who rated both SW1 and SW2 tended to give them similar
ratings. Thus, we could conclude that A would also give SW2 a low rating, similar to A's
rating of SW1.
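The second line of reasoning above can be sketched in code: measure how closely two items are rated by the users who rated both, and, if the ratings agree, carry a user's rating of one item over to the other. This is only an illustrative sketch under stated assumptions. The rating values, the users P, Q, and R, the functions mean_abs_gap and predict_from_similar_item, and the max_gap threshold are all hypothetical; the only fact taken from the text is that A gave SW1 a low rating.

```python
from typing import Dict, Optional

UtilityMatrix = Dict[str, Dict[str, int]]


def mean_abs_gap(matrix: UtilityMatrix, item_x: str, item_y: str) -> Optional[float]:
    """Average |rating(x) - rating(y)| over users who rated both items."""
    gaps = [
        abs(row[item_x] - row[item_y])
        for row in matrix.values()
        if item_x in row and item_y in row
    ]
    if not gaps:
        return None  # no user rated both items, so there is no evidence
    return sum(gaps) / len(gaps)


def predict_from_similar_item(
    matrix: UtilityMatrix,
    user: str,
    target: str,
    proxy: str,
    max_gap: float = 1.0,  # assumed cutoff for "similar ratings"
) -> Optional[int]:
    """Fill a blank for `target` with the user's rating of `proxy`,
    but only when co-raters tend to rate the two items alike."""
    row = matrix.get(user, {})
    if target in row:
        return row[target]  # not a blank; nothing to predict
    if proxy not in row:
        return None
    gap = mean_abs_gap(matrix, target, proxy)
    if gap is None or gap > max_gap:
        return None
    return row[proxy]


# Hypothetical data: A rated SW1 low; P, Q, and R rated both movies.
ratings: UtilityMatrix = {
    "A": {"SW1": 1},
    "P": {"SW1": 4, "SW2": 5},
    "Q": {"SW1": 2, "SW2": 2},
    "R": {"SW1": 5, "SW2": 4},
}

print(mean_abs_gap(ratings, "SW1", "SW2"))
print(predict_from_similar_item(ratings, "A", target="SW2", proxy="SW1"))
```

Copying the proxy rating directly is the crudest possible estimate; it is used here only because it mirrors the informal argument in the text.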
We should also be aware of a slightly different goal that makes sense in many applications.
It is not necessary to predict every blank entry in a utility matrix. Rather, it is only
necessary to discover some entries in each row that are likely to be high. In most applications,
the recommendation system does not offer users a ranking of all items, but rather