EXAMPLE 9.4 Consider the same movie information as in Example 9.3, but now suppose
the utility matrix has nonblank entries that are ratings in the 1-5 range. Suppose user U
gives an average rating of 3. There are three movies with Julia Roberts as an actor, and
those movies got ratings of 3, 4, and 5. Then in the user profile of U, the component for
Julia Roberts will have a value equal to the average of 3 − 3, 4 − 3, and 5 − 3, that is, a value
of 1.
On the other hand, user V gives an average rating of 4, and has also rated three movies
with Julia Roberts (it doesn't matter whether or not they are the same three movies U rated).
User V gives these three movies ratings of 2, 3, and 5. The user profile for V has, in the
component for Julia Roberts, the average of 2 − 4, 3 − 4, and 5 − 4, that is, the value
−2 / 3.
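The arithmetic of Example 9.4 can be checked with a minimal sketch. The function name `profile_component` is hypothetical and chosen only for illustration; the ratings and user averages are the ones given in the example.

```python
def profile_component(ratings_for_feature, user_average):
    """Average of (rating - user_average) over the rated movies that have the feature."""
    deviations = [r - user_average for r in ratings_for_feature]
    return sum(deviations) / len(deviations)

# User U: average rating 3, Julia Roberts movies rated 3, 4, 5.
u_julia = profile_component([3, 4, 5], 3)

# User V: average rating 4, Julia Roberts movies rated 2, 3, 5.
v_julia = profile_component([2, 3, 5], 4)

print(u_julia)  # 1.0
print(v_julia)  # approximately -0.667, i.e. -2/3
```

Subtracting each user's own average is what lets a rating of 3 count as neutral for U but as below average for V.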
9.2.6 Recommending Items to Users Based on Content
With profile vectors for both users and items, we can estimate the degree to which a user
would prefer an item by computing the cosine distance between the user's and item's
vectors. As in Example 9.2, we may wish to scale various components whose values are not
boolean. The random-hyperplane and locality-sensitive-hashing techniques can be used to
place (just) item profiles in buckets. In that way, given a user to whom we want to
recommend some items, we can apply the same two techniques (random hyperplanes and LSH)
to determine in which buckets we must look for items that might have a small cosine
distance from the user.
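As a rough sketch of this procedure, the code below measures cosine distance as the angle between two vectors and buckets item profiles by the side of each random hyperplane on which they fall. Everything concrete here is an assumption made for illustration: the four-component profiles, the item names, the number of hyperplanes, and the helper names `cosine_distance`, `make_hyperplanes`, and `bucket_key`. A practical system would typically use several such hash tables rather than one.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    """Angle between a and b in degrees, from 0 to 180."""
    cos = dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
    cos = max(-1.0, min(1.0, cos))  # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(cos))

def make_hyperplanes(num_planes, dim, seed=0):
    """Random normal vectors, one per hyperplane through the origin."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(num_planes)]

def bucket_key(vec, hyperplanes):
    """One bit per hyperplane: which side of it the vector lies on."""
    return tuple(dot(h, vec) >= 0 for h in hyperplanes)

# Hypothetical item profiles (boolean actor features) and a user profile.
items = {"m1": [1, 1, 0, 0], "m2": [0, 0, 1, 1], "m3": [1, 0, 1, 0]}
user = [0.8, 0.6, -0.5, -0.2]

# Place only the item profiles in buckets.
planes = make_hyperplanes(num_planes=4, dim=4)
buckets = {}
for name, vec in items.items():
    buckets.setdefault(bucket_key(vec, planes), []).append(name)

# Hash the user the same way to find candidate items, then rank by angle.
candidates = buckets.get(bucket_key(user, planes), [])
ranked = sorted(items, key=lambda n: cosine_distance(user, items[n]))
print(candidates, ranked)
```

Vectors separated by a small angle agree on most hyperplane bits, so the user's bucket tends to hold the items with small cosine distance. The exhaustive `ranked` list is included only for comparison on this tiny data set.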
EXAMPLE 9.5 Consider first the data of Example 9.3. The user's profile will have components
for actors proportional to the likelihood that the actor will appear in a movie the user
likes. Thus, the highest recommendations (lowest cosine distance) belong to the movies
with lots of actors that appear in many of the movies the user likes. As long as actors are
the only information we have about features of movies, that is probably the best we can
do. 3
Now, consider Example 9.4. There, we observed that the vector for a user will have
positive numbers for actors that tend to appear in movies the user likes and negative numbers
for actors that tend to appear in movies the user doesn't like. Consider a movie with many
actors the user likes, and only a few or none that the user doesn't like. The cosine of the
angle between the user's and movie's vectors will be a large positive fraction. That implies
an angle close to 0, and therefore a small cosine distance between the vectors.
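To make the case of a well-liked movie concrete, here is a small sketch with invented numbers. The four actor components in `user` and the cast in `movie` are assumptions for illustration only, not data from the examples.

```python
import math

def cosine(a, b):
    """Cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical user profile: positive for three liked actors, negative for one disliked.
user = [1.0, 0.8, 0.6, -0.5]

# Hypothetical movie featuring the three liked actors and not the disliked one.
movie = [1, 1, 1, 0]

cos_liked = cosine(user, movie)
angle_liked = math.degrees(math.acos(cos_liked))
print(cos_liked, angle_liked)  # cosine is about 0.92, so the angle is well under 90 degrees
```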
Next, consider a movie with about as many actors that the user likes as those the user
doesn't like. In this situation, the cosine of the angle between the user and movie is around
0, and therefore the angle between the two vectors is around 90 degrees. Finally, consider